Overview statistic tests PB (2024)

Jonas van Nijnatten

contact: J.J.vanNijnatten@uva.nl

source code: https://github.com/jonasvannijnatten/PB_statistiek

laatste update: 23-04-2024

Other R manuals for Psychobiology: https://jonasvannijnatten.github.io/PB_statistiek/

1.1 Versions

software versions used for this tutorial:
- R version 4.3.1 (2023-06-16 ucrt)
- car-package version: 3.1.2 (2023-03-25)
- ez-package version: 4.4.0 (2016-11-01)

1.2 Installation

Install & activate the required packages:

reqPacks = c("markdown","knitr","kableExtra","car","ez","afex", "ggplot2", "gridExtra")if (!require("pacman")) install.packages("pacman")pacman::p_load(reqPacks, character.only = TRUE)

2.1 Independent samplesT-test

Show code for data generation

# generate dataN = 40data.long = data.frame( ID = 1:N, condition = rep(x = c("A","B"), each = N), score = c(rnorm(n = N, mean = 25, sd = 6.5), rnorm(n = N, mean = 35, sd = 6.5)))

data.long
ID	condition	score
1	A	29.60012
2	A	24.31170
3	A	29.81539
4	A	19.74589
5	A	30.93069
1	B	33.87671
2	B	32.30768
3	B	30.25141
4	B	27.86297
5	B	27.60118

Test assumption of normality

by(data = data.long$score, INDICES = data.long$condition, FUN = shapiro.test)

## data.long$condition: A## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.97863, p-value = 0.6388## ## ------------------------------------------------------------ ## data.long$condition: B## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.98827, p-value = 0.9471

Test assumption of equality of variances

leveneTest(y = data.long$score, group = data.long$condition)

## Warning in leveneTest.default(y = data.long$score, group =## data.long$condition): data.long$condition coerced to factor.

## Levene's Test for hom*ogeneity of Variance (center = median)## Df F value Pr(>F)## group 1 1.7776 0.1863## 78

T-test

t.test(formula = score ~ condition, data = data.long, paired=FALSE, alternative="two.sided", var.equal=TRUE)

## ## Two Sample t-test## ## data: score by condition## t = -6.6406, df = 78, p-value = 3.772e-09## alternative hypothesis: true difference in means between group A and group B is not equal to 0## 95 percent confidence interval:## -11.218470 -6.043389## sample estimates:## mean in group A mean in group B ## 25.94737 34.57829

2.2 Paired samplesT-test

Show code for data generation

# generate dataN = 30data.long = data.frame( ID = rep(1:N,2), condition = rep(x = c("A","B"), each = N), score = c(rnorm(n = N, mean = 25, sd = 6.5), rnorm(n = N, mean = 35, sd = 6.5)))

data.long
ID	condition	score
1	A	22.92990
2	A	17.66431
3	A	30.36792
4	A	25.76655
5	A	36.59773
1	B	41.85406
2	B	45.21567
3	B	29.26915
4	B	29.35323
5	B	30.78916

Test assumption of normality

# calculate difference scoresdiffScore = data.long$score[data.long$condition=="A"] - data.long$score[data.long$condition=="B"]shapiro.test(diffScore)

## ## Shapiro-Wilk normality test## ## data: diffScore## W = 0.99293, p-value = 0.999

T-test

t.test(formula = score ~ condition, data = data.long, paired=TRUE, alternative="two.sided", var.equal=TRUE)

## ## Paired t-test## ## data: score by condition## t = -4.8757, df = 29, p-value = 3.583e-05## alternative hypothesis: true mean difference is not equal to 0## 95 percent confidence interval:## -12.918699 -5.283426## sample estimates:## mean difference ## -9.101063

Show code for data generation

# generate dataset.seed(05)nrobs = 100experience = rnorm(n = nrobs, mean = 15, sd = 3)salary = 10000 + ( 5 * experience ) + rnorm(n = nrobs, mean = 0, sd = 100)data.long = data.frame(experience, salary)# calculate correlation coefficient rcorr_coef = cor(x = data.long$experience, y = data.long$salary)rm(list = c("nrobs", "experience", "salary"))

data.long
experience	salary
12.47743	9862.848
19.15308	10209.297
11.23352	10123.747
15.21043	10096.900
20.13432	10094.887

Test assumption of normality

apply(X = data.long, MARGIN = 2, FUN = shapiro.test)

## $experience## ## Shapiro-Wilk normality test## ## data: newX[, i]## W = 0.98711, p-value = 0.445## ## ## $salary## ## Shapiro-Wilk normality test## ## data: newX[, i]## W = 0.99257, p-value = 0.8607

Pearson correlation

cor.test(x=data.long$experience, y=data.long$salary, alternative = "two.sided", method = "pearson")

## ## Pearson's product-moment correlation## ## data: data.long$experience and data.long$salary## t = 2.5509, df = 98, p-value = 0.01229## alternative hypothesis: true correlation is not equal to 0## 95 percent confidence interval:## 0.05584335 0.42510763## sample estimates:## cor ## 0.2495245

Spearman correlation

cor.test(x=data.long$experience, y=data.long$salary, alternative = "two.sided", method = "spearman")

## ## Spearman's rank correlation rho## ## data: data.long$experience and data.long$salary## S = 121190, p-value = 0.006181## alternative hypothesis: true rho is not equal to 0## sample estimates:## rho ## 0.2727873

Show code for data generation

# generate dataset.seed(05)nrobs = 100experience = rnorm(n = nrobs, mean = 15, sd = 3)salary = 10000 + ( 5 * experience ) + rnorm(n = nrobs, mean = 0, sd = 100)data.long = data.frame(experience, salary)# calculate correlation coefficient rcorr_coef = cor(x = data.long$experience, y = data.long$salary)rm(list = c("nrobs", "experience", "salary"))

data.long
experience	salary
12.47743	9862.848
19.15308	10209.297
11.23352	10123.747
15.21043	10096.900
20.13432	10094.887

Fit linear model

linearModel = lm(formula = salary ~ experience, data = data.long)

Test assumption of normality & equal variances

# plot the residuals and look at the distribution around the 0-line and if the spread is equal over all x's.plot(x=linearModel$fitted.values, y=linearModel$residuals, xlab = "predicted", ylab = "residuals"); abline(h=0)

shapiro.test(linearModel$residuals)

## ## Shapiro-Wilk normality test## ## data: linearModel$residuals## W = 0.99208, p-value = 0.8268

Regression test

summary(linearModel)

## ## Call:## lm(formula = salary ~ experience, data = data.long)## ## Residuals:## Min 1Q Median 3Q Max ## -248.24 -61.55 10.13 74.34 273.17 ## ## Coefficients:## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 9934.674 56.812 174.870 <2e-16 ***## experience 9.437 3.700 2.551 0.0123 * ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1## ## Residual standard error: 104.4 on 98 degrees of freedom## Multiple R-squared: 0.06226, Adjusted R-squared: 0.05269 ## F-statistic: 6.507 on 1 and 98 DF, p-value: 0.01229

Show code for data generation

set.seed(01)N = 30data.long = data.frame( x = 1:N, y = round( 1 / (1 + exp(-.23*(1:N-10))) + # logistic function of x rnorm(n = N, mean = 0, sd = .3) # add some noise ) # round it to either 1 or 0 )

# fit a logistic model, which is a generalized linear model, hence glm() functionglm.fit <- glm(y ~ x, data = data.long, family = binomial('logit'))# test the logistic model against the NULL modelanova(glm.fit, test="Chisq")

## Analysis of Deviance Table## ## Model: binomial, link: logit## ## Response: y## ## Terms added sequentially (first to last)## ## ## Df Deviance Resid. Df Resid. Dev Pr(>Chi) ## NULL 29 38.191 ## x 1 11.329 28 26.862 0.0007631 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(data.long$x, data.long$y) # plot the measured datalines(formula = glm.fit$fitted.values ~ data.long$x) # plot the fitted model

Toon code om voorbeeld data te genereren

set.seed(05) # set seednrofconds = 3 # set number of conditionsnrofsubs = 20 # set number of subjectssubj = as.factor(1:(nrofsubs*nrofconds)) # create array with subject IDscondition = as.factor(rep(LETTERS[1:nrofconds],each=nrofsubs)) # create array with condition valuesscore = as.vector( replicate( nrofconds , rnorm(n = nrofsubs, mean = sample(8,1)+10 , sd = sample(5,1) ) ) ) # create array with measurement valuesdata.long = data.frame(subj, condition, score); # combine arrays into a data.framerm(list=c("subj", "condition","score","nrofconds","nrofsubs")) # delete unnecessary variables

data.long
subj	condition	score
1	A	16.153078
2	A	8.233524
21	B	18.935924
22	B	17.413522
41	C	15.895668
42	C	13.543431

6.1 Checkingassumptions

Test assumption of normality

by(data = data.long$score, INDICES = data.long$condition, FUN = shapiro.test)

## data.long$condition: A## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.9637, p-value = 0.6201## ## ------------------------------------------------------------ ## data.long$condition: B## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.92868, p-value = 0.1456## ## ------------------------------------------------------------ ## data.long$condition: C## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.94927, p-value = 0.3561

Test equality of variances

leveneTest(y = data.long$score, group = data.long$condition)

## Levene's Test for hom*ogeneity of Variance (center = median)## Df F value Pr(>F) ## group 2 6.6526 0.002531 **## 57 ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.2 Running the test

6.2.1 parametric methods

6.2.1.1 usingezANOVA()

ANOVA using the ezANOVA method

ezModel = ezANOVA(data = data.long, dv = score, between = condition, wid = subj)ezModel$ANOVA

## Effect DFn DFd F p p<.05 ges## 1 condition 2 57 28.42547 2.734255e-09 * 0.4993454

6.2.1.2using aov_ez() (Recommended by Jonas)

ANOVA using the aov_ez() method 1. First webuild the model using the aov_ez() function from theafex package.

afexModel = aov_ez(data = data.long, id = "subj", dv = "score", between = "condition")afexModel$anova_table

## Anova Table (Type 3 tests)## ## Response: score## num Df den Df MSE F ges Pr(>F) ## condition 2 57 4.9476 28.425 0.49935 2.734e-09 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next we test the assumption of normality for all the residuals inthe model at once:

hist(afexModel$lm$residuals)

shapiro.test(afexModel$lm$residuals)

## ## Shapiro-Wilk normality test## ## data: afexModel$lm$residuals## W = 0.98581, p-value = 0.7121

# library(car)qqPlot(afexModel$lm)

Then we test the assumption of equal variences using Levene’sTest:

leveneTest(afexModel$lm)

## Levene's Test for hom*ogeneity of Variance (center = median)## Df F value Pr(>F) ## group 2 6.6526 0.002531 **## 57 ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.2.1.3 usingaov()

ANOVA using the aov method

aovModel = aov(formula = score ~ condition, data = data.long)summary(aovModel)

## Df Sum Sq Mean Sq F value Pr(>F) ## condition 2 281.3 140.64 28.43 2.73e-09 ***## Residuals 57 282.0 4.95 ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.2.1.4 usinglm()

ANOVA using the linear model method

lmModel = lm(formula = score ~ condition, data = data.long)summary(lmModel)

## ## Call:## lm(formula = score ~ condition, data = data.long)## ## Residuals:## Min 1Q Median 3Q Max ## -5.9714 -1.1873 -0.1094 1.2109 5.7148 ## ## Coefficients:## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 11.4195 0.4974 22.960 < 2e-16 ***## conditionB 5.3034 0.7034 7.540 4.01e-10 ***## conditionC 2.6232 0.7034 3.729 0.000444 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1## ## Residual standard error: 2.224 on 57 degrees of freedom## Multiple R-squared: 0.4993, Adjusted R-squared: 0.4818 ## F-statistic: 28.43 on 2 and 57 DF, p-value: 2.734e-09

6.2.2 Non-parametricmethods

kruskal.test(score~condition, data.long)

## ## Kruskal-Wallis rank sum test## ## data: score by condition## Kruskal-Wallis chi-squared = 28.542, df = 2, p-value = 6.343e-07

6.3 Post-hocanalysis

When using the aov method, use the Tukey post-hoc test

TukeyHSD(aovModel)

## Tukey multiple comparisons of means## 95% family-wise confidence level## ## Fit: aov(formula = score ~ condition, data = data.long)## ## $condition## diff lwr upr p adj## B-A 5.303422 3.6107737 6.9960712 0.0000000## C-A 2.623183 0.9305343 4.3158318 0.0012725## C-B -2.680239 -4.3728882 -0.9875907 0.0009846

When using the linear model method, use pairwise t-test

pairwise.t.test(x = data.long$score, g = data.long$condition, paired = FALSE, p.adjust.method = "bonferroni")

## ## Pairwise comparisons using t tests with pooled SD ## ## data: data.long$score and data.long$condition ## ## A B ## B 1.2e-09 - ## C 0.0013 0.0010## ## P value adjustment method: bonferroni

Show code for data generation

set.seed(01) # set seednrofcondsf1 = 3 # set number of conditions for factor 1nrofcondsf2 = 2 # set number of conditions for factor 2nrofsubs = nrofcondsf1*nrofcondsf2*30 # set number of subjects per conditionsubj = as.factor(1:(nrofsubs)) # create array with subject IDs# create array witht treatment conditionstreatment = as.factor(rep(LETTERS[1:nrofcondsf1],each=nrofsubs/nrofcondsf1)) # create array with control / experimentalcontrol = as.factor(rep(c("control","experimental"),times=nrofsubs/nrofcondsf2)) # create array with measurement valuesscore = as.vector( replicate(nrofcondsf1, replicate ( nrofcondsf2 , rnorm( n = (nrofsubs/(nrofcondsf1*nrofcondsf2)), mean = 0 , sd = sample(5,1) ) + sample(8,1)+10 ) ) ) # combine arrays into a data.framedata.long = data.frame(subj, score, treatment, control); # delete unnecessary arraysrm(list=c("control","nrofcondsf1","nrofcondsf2","nrofsubs","score","subj","treatment"))

data.long
subj	score	treatment	control
1	11.673767	A	control
2	13.329799	A	experimental
61	13.484527	B	control
62	10.149803	B	experimental
121	10.811829	C	control
122	7.591698	C	experimental

7.1 Fitting themodel

Fit model with multiple prediction factors

model = lm(formula = score ~ treatment+control, data = data.long)

7.2 Checkingassumptions

Test assumption of normality

hist(model$residuals)plot(x = fitted(model), y = residuals(model), xlab="predicted", ylab="residuals"); abline(h=0)by(data = data.long$score, INDICES = paste(data.long$treatment, data.long$control), FUN = shapiro.test)

Show residuals histogram

Show residuals vs predicted plot

Show output of Shapiro-Wilk test

## paste(data.long$treatment, data.long$control): A control## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.9882, p-value = 0.9787## ## ------------------------------------------------------------ ## paste(data.long$treatment, data.long$control): A experimental## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.98063, p-value = 0.8421## ## ------------------------------------------------------------ ## paste(data.long$treatment, data.long$control): B control## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.97482, p-value = 0.6773## ## ------------------------------------------------------------ ## paste(data.long$treatment, data.long$control): B experimental## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.92252, p-value = 0.0312## ## ------------------------------------------------------------ ## paste(data.long$treatment, data.long$control): C control## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.92379, p-value = 0.03368## ## ------------------------------------------------------------ ## paste(data.long$treatment, data.long$control): C experimental## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.92347, p-value = 0.03302

Test equality of variances

leveneTest(y = data.long$score, group = as.factor(paste(data.long$treatment, data.long$control)))

## Levene's Test for hom*ogeneity of Variance (center = median)## Df F value Pr(>F) ## group 5 4.1737 0.001318 **## 174 ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

7.3 Running the test

ANOVA with the linear model method

model = lm(formula = score ~ treatment*control, data = data.long)library(car)Anova(mod = model, type = 'II')

## Anova Table (Type II tests)## ## Response: score## Sum Sq Df F value Pr(>F) ## treatment 258.86 2 13.9977 2.307e-06 ***## control 42.48 1 4.5938 0.03348 * ## treatment:control 68.54 2 3.7065 0.02652 * ## Residuals 1608.91 174 ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(model)

## ## Call:## lm(formula = score ~ treatment * control, data = data.long)## ## Residuals:## Min 1Q Median 3Q Max ## -9.4199 -1.3624 0.2495 1.4918 10.5877 ## ## Coefficients:## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 12.15286 0.55518 21.890 < 2e-16 ***## treatmentB 0.02103 0.78514 0.027 0.97866 ## treatmentC -1.88866 0.78514 -2.406 0.01720 * ## controlexperimental -0.30843 0.78514 -0.393 0.69492 ## treatmentB:controlexperimental 2.94763 1.11035 2.655 0.00867 ** ## treatmentC:controlexperimental 0.89236 1.11035 0.804 0.42268 ## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1## ## Residual standard error: 3.041 on 174 degrees of freedom## Multiple R-squared: 0.1869, Adjusted R-squared: 0.1636 ## F-statistic: 8 on 5 and 174 DF, p-value: 8.234e-07

7.4 Post-hocanalysis

When there are one or more significant main-effects but nosignificant interaction effect post-hoc comparisons can be made for eachfactor separately using the pairwise.t.test() function.

Post-hoc comparison for the first factor:

pairwise.t.test( x = data.long$score # column name of the dependent variable ,g = data.long$treatment # column name of the first within-subjects variable , paired = FALSE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using t tests with pooled SD ## ## data: data.long$score and data.long$treatment ## ## A B ## B 0.028 - ## C 0.036 2e-06## ## P value adjustment method: bonferroni

Post-hoc comparison for the first factor:

pairwise.t.test( x = data.long$score # column name of the dependent variable ,g = data.long$control # column name of the second within-subjects variable , paired = FALSE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using t tests with pooled SD ## ## data: data.long$score and data.long$control ## ## control## experimental 0.05 ## ## P value adjustment method: bonferroni

When the interaction effect turns out significant, pairwisecomparisons can be made between all combinations of conditions.

Post-hoc comparison for the interaction effect:

pairwise.t.test( x = data.long$score # column name of the dependent variable # merge both factor columns into a combined factor ,g = paste(data.long$treatment, data.long$control) , paired = FALSE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using t tests with pooled SD ## ## data: data.long$score and paste(data.long$treatment, data.long$control) ## ## A control A experimental B control B experimental C control## A experimental 1.0000 - - - - ## B control 1.0000 1.0000 - - - ## B experimental 0.0131 0.0032 0.0143 - - ## C control 0.2580 0.6854 0.2402 4.7e-07 - ## C experimental 1.0000 1.0000 1.0000 1.7e-05 1.0000 ## ## P value adjustment method: bonferroni

Show code for data generation

# Generate datasetset.seed(01) # set seednrofsubs = 20 # set number of subjectsdata.wide = data.frame( subj = as.factor(1:nrofsubs) , A = rnorm(n = nrofsubs, mean = sample(8,1)+10 , sd = sample(5,1)) , B = rnorm(n = nrofsubs, mean = sample(8,1)+10 , sd = sample(5,1)) , C = rnorm(n = nrofsubs, mean = sample(8,1)+10 , sd = sample(5,1)))rm(list=c("nrofsubs")) # delete arrays# reshape the data to the long format for some analysis methodsdata.long = reshape( data = data.wide ,direction = 'long' # richting van de data-transformatie ,varying = c('A','B','C') # kolomnamen die worden samengevoegd ,idvar = 'subj' # kolomnaam met de ppn-nummers ,v.names = 'score' # naam van nieuwe kolom met meetwaarden ,timevar = 'condition' # naam van nieuwe kolom met condities ,times = c('A','B','C') # waardes )data.long$condition = as.factor(data.long$condition)

data.wide
subj	A	B	C
1	11.734573	17.298260	13.226653
2	7.657486	9.042593	8.244977
3	17.381123	19.479303	8.170019
4	12.318031	16.775485	12.458328

data.long
subj	condition	score
1	A	11.734573
2	A	7.657486
1	B	17.298260
2	B	9.042593
1	C	13.226653
2	C	8.244977

8.1 Fitting the model

8.1.1 Parametric Test

8.1.1.1 UsingezAONVA

First we fit the model using the ezANOVA() function:

ezModel = ezANOVA(data = data.long, dv = score, wid = subj, within = condition)

Then we check for the assumption of normality. Because the modeltakes into account individual differences, we should first correct theindividual score by removing the individual differences from the data.We do this by calculating how each individual differes from the overallmean.

\(corrected = Y_{i,j}corrected -(\overline{Yi} - \overline{Y})\)

where \(i\) represents theindividual subjects, and \(j\)represents the different conditions

# initialize variables to store calculate correted scoredata.long$subjMean = rep(NA, dim(data.long)[1])data.long$correction = rep(NA, dim(data.long)[1])data.long$corrected = rep(NA, dim(data.long)[1])for (i in unique(data.long$subj)){ # calculate the mean score for each subject (across conditions) data.long$subjMean[which(data.long$subj==i)] = mean(data.long$score[which(data.long$subj==i)]) # calculate the correction as the difference between the subject mean vs the overall mean data.long$correction[which(data.long$subj==i)] = (data.long$subjMean[which(data.long$subj==i)] - mean(data.long$score)) # calculate the corrected score by subtracting the correction per subject from each individual score value data.long$corrected[which(data.long$subj==i)] = data.long$score[which(data.long$subj==i)] - data.long$correction[which(data.long$subj==i)]}p1 = ggplot(data.long, aes(x=condition, y=score, group=1, colour=subj)) + geom_point () + geom_line ( linetype= "dashed", aes(group=subj) ) + stat_summary ( geom = "line", fun = "mean" , linewidth=2, colour="black", linetype="solid") + stat_summary ( geom = "point", fun = "mean" , size=2, colour="black") + geom_errorbar( stat="summary", fun.data="mean_se", linewidth=1, fun.args = 2, width = 0.3 ) + ylim(0, 25) + guides(color = "none")p2 = ggplot(data.long, aes(x=condition, y=corrected, group=1, colour=subj)) + geom_point () + geom_line ( linetype= "dashed", aes(group=subj) ) + stat_summary ( geom = "line", fun = "mean" , linewidth=2, colour="black", linetype="solid") + stat_summary ( geom = "point", fun = "mean" , size=2, colour="black") + geom_errorbar( stat="summary", fun.data="mean_se", linewidth=1, fun.args = 2, width = 0.3 ) + ylim(0, 25) + guides(color = "none")grid.arrange(p1, p2, nrow = 1)

Then we can use the corrected data to test for normality:

by(data = data.long$corrected, INDICES = data.long$condition, FUN = shapiro.test)par(mfrow=c(1,3))by(data = data.long$corrected, INDICES = data.long$condition, FUN = hist)par(mfrow=c(1,1))

## data.long$condition: A## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.93824, p-value = 0.2221## ## ------------------------------------------------------------ ## data.long$condition: B## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.94532, p-value = 0.3015## ## ------------------------------------------------------------ ## data.long$condition: C## ## Shapiro-Wilk normality test## ## data: dd[x, ]## W = 0.9478, p-value = 0.3349

Then we check the assumption of spherecity using Mauchly’s test(which is built into the ezANOVA)

ezModel$`Mauchly's Test for Sphericity`

## Effect W p p<.05## 2 condition 0.7520886 0.07698763

If the assumption of spherecity is violated, you can use thecorrected output of the ezANOVA() function. You don’ need terefer to a non-parametric alternative.

ezModel$`Sphericity Corrections`

## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05## 2 condition 0.8013389 0.0006974823 * 0.8637409 0.0004837266 *

Otherwise we can use the default output as results:

ezModel$ANOVA

## Effect DFn DFd F p p<.05 ges## 2 condition 2 38 10.61008 0.0002182557 * 0.2675837

8.1.1.2Using aov_ez() (recommended by Jonas)

First we fit the model using aov_ez()

library(afex)afexModel = aov_ez(id = "subj", dv = "score", data = data.long, within = "condition", )

Next we test the assumption of normality for all the residuals inthe model at once:

hist(afexModel$lm$residuals)

shapiro.test(afexModel$lm$residuals)

## ## Shapiro-Wilk normality test## ## data: afexModel$lm$residuals## W = 0.9868, p-value = 0.7629

The aov_ez() function automatically corrects the output forsphericity. The degree of the correction depends on the degree of(non)-sphericity. So when the data is perfectly spherical, thecorrection is zero.

afexModel

## Anova Table (Type 3 tests)## ## Response: score## Effect df MSE F ges p.value## 1 condition 1.60, 30.45 15.60 10.61 *** .268 <.001## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1## ## Sphericity correction method: GG

8.1.1.3 Usinglm()

fit linear model

model = lm(formula = cbind(data.wide$A, data.wide$B, data.wide$C)~1)

Checking assumptions

Test assumption of normality

shapiro.test(model$residuals)hist(model$residuals)

## ## Shapiro-Wilk normality test## ## data: model$residuals## W = 0.9868, p-value = 0.7629

Test assumption of sphericity

mauchly.test(model, X=~1)

## ## Mauchly's test of sphericity## Contrasts orthogonal to## ~1## ## ## data: SSD matrix from lm(formula = cbind(data.wide$A, data.wide$B, data.wide$C) ~ 1)## W = 0.75209, p-value = 0.07699

Running the test

Test the model using ANOVA

anova(model, X = ~1, test="Spherical")

## Analysis of Variance Table## ## ## Contrasts orthogonal to## ~1## ## Greenhouse-Geisser epsilon: 0.8013## Huynh-Feldt epsilon: 0.8637## ## Df F num Df den Df Pr(>F) G-G Pr H-F Pr## (Intercept) 1 10.61 2 38 0.00021826 0.00069748 0.00048373## Residuals 19

8.1.2 Non-parametrictest

When the assumptions of parametric testing are violated the FriedmanANOVA can be used: (Note: when the assumption of sphericity is violatedyou can still run an ANOVA and use the Greenhouse-Geisser corrected F-and p-value.)

FMaov = friedman.test(score~condition|subj, data.long)

8.2 Post-hoc analysis

8.2.1 parametric

For post-hoc analysis the data has to be transformed to the longformat

Show code for reshaping

# reshape into wide formatdata.long = reshape(data = data.wide , direction = 'long' , varying = c("A","B","C") , timevar = 'treatment' , times = c("A","B","C") , v.names = c("score") , idvar = "subj" )

Use pairwise T-test to compare each combination of conditions

pairwise.t.test( x = data.long$score # column name of the dependent variable ,g = data.long$treatment # column name of the first within-subjects variable , paired = TRUE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using paired t tests ## ## data: data.long$score and data.long$treatment ## ## A B ## B 9.4e-05 - ## C 1.0000 0.0031## ## P value adjustment method: bonferroni

8.2.2 Non-parametric

For post-hoc analysis the data has to be transformed to the longformat

Show code for reshaping

# reshape into wide formatdata.long = reshape(data = data.wide , direction = 'long' , varying = c("A","B","C") , timevar = 'treatment' , times = c("A","B","C") , v.names = c("score") , idvar = "subj" )

For non-parametric post-hoc test the pairwise Wilcoxon tests can beused as an alternative to the pairwise t-tests. The default method tocorrect the p-value for multiple comparisons is the Holm method(see help file).

pairwise.wilcox.test(x = data.long$score, g =data.long$condition, paired = TRUE)

## ## Pairwise comparisons using ## ## data: data.long$score and data.long$condition ## ## <0 x 0 matrix>## ## P value adjustment method: holm

Show code for data generation

set.seed(02) # set seednrofsubs = 20 # set number of subjects data.wide = data.frame( ID=1:nrofsubs, preA=rnorm(n=nrofsubs, mean = 09,sd = 3), preB=rnorm(n=nrofsubs, mean = 10,sd = 3), preC=rnorm(n=nrofsubs, mean = 11,sd = 3), postA=rnorm(n=nrofsubs, mean = 11,sd = 3), postB=rnorm(n=nrofsubs, mean = 15,sd = 3), postC=rnorm(n=nrofsubs, mean = 20,sd = 3))rm(list="nrofsubs")

data.wide
ID	preA	preB	preC	postA	postB	postC
1	6.309256	16.272458	9.849241	5.635273	17.987954	23.22338
2	9.554548	6.400222	5.122691	17.093728	9.912705	20.78179
3	13.763536	14.768915	8.474885	8.890567	13.399884	19.05718
4	5.608873	15.863955	16.710642	11.474494	10.883192	17.75111

9.1 Fitting themodel

Fit linear model to the data, not yet seperating both independentfactors:

model = lm(cbind(data.wide$preA, data.wide$preB, data.wide$preC, data.wide$postA, data.wide$postB, data.wide$postC )~1 )

Make a factor matrix clarifying the conditions. Name the colomns asthe factors. Use the parts of the column names of the data.frame in thewide format as variable values in the matrix.

factorMatrix = data.frame( time=as.factor(rep(c("pre","post"),each=3)), condition = as.factor(rep(c("A","B","C"),2)))

fit linear model specifying the two independent variables using thefactorMatrix

library(car)factorialAnova = Anova(mod=model, idata=factorMatrix, idesign= ~time*condition, type='III')

9.2 Checkingassumptions

shapiro.test(model$residuals)hist(model$residuals)

## ## Shapiro-Wilk normality test## ## data: model$residuals## W = 0.98072, p-value = 0.08312

9.3 Running the test

run the factorial ANOVA

summary(object = factorialAnova, multivariate=FALSE)

## Warning in summary.Anova.mlm(object = factorialAnova, multivariate = FALSE): HF## eps > 1 treated as 1

## ## Univariate Type III Repeated-Measures ANOVA Assuming Sphericity## ## Sum Sq num Df Error SS den Df F value Pr(>F) ## (Intercept) 19546.1 1 176.39 19 2105.421 < 2.2e-16 ***## time 733.4 1 246.98 19 56.421 4.218e-07 ***## condition 777.6 2 483.72 38 30.543 1.235e-08 ***## time:condition 400.6 2 417.36 38 18.235 2.807e-06 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1## ## ## Mauchly Tests for Sphericity## ## Test statistic p-value## condition 0.98514 0.87398## time:condition 0.96336 0.71466## ## ## Greenhouse-Geisser and Huynh-Feldt Corrections## for Departure from Sphericity## ## GG eps Pr(>F[GG]) ## condition 0.98536 1.541e-08 ***## time:condition 0.96466 3.983e-06 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1## ## HF eps Pr(>F[HF])## condition 1.098534 1.235183e-08## time:condition 1.071609 2.806538e-06

The Mauchly Test checks each factor on the assumption of sphericity.If the Mauchly Test shows a significant result for any of the factors orinteraction between factors then a corrected p=value should be usedinstead of the normal, uncorrected p-value.

9.4 Post-hocanalysis

For post-hoc analysis the data has to be transformed to the longformat

Show code for reshaping

# reshape into wide formatdata.mix = reshape(data = data.wide , direction = 'long' , varying = list(c('preA','preB','preC'),c('postA','postB','postC')) , timevar = 'treatment' , times = c("A","B","C") , v.names = c("pre","post") , idvar = "ID" )data.long = reshape(data = data.mix , direction = 'long' , varying = c("pre","post") , timevar = 'pre_post' , times = c("pre","post") , v.names = "score" , idvar = c("ID", "treatment") )

When there are one or more significant main-effects but nosignificant interaction effect post-hoc comparisons can be made for eachfactor separately using the pairwise.t.test() function.

Post-hoc comparison for the first factor:

pairwise.t.test( x = data.long$score # column name of the dependent variable ,g = data.long$treatment # column name of the first within-subjects variable , paired = TRUE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using paired t tests ## ## data: data.long$score and data.long$treatment ## ## A B ## B 0.06089 - ## C 1.2e-06 0.00012## ## P value adjustment method: bonferroni

Post-hoc comparison for the first factor:

pairwise.t.test( x = data.long$score # column name of the dependent variable ,g = data.long$pre_post # column name of the second within-subjects variable , paired = TRUE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using paired t tests ## ## data: data.long$score and data.long$pre_post ## ## post ## pre 3.1e-08## ## P value adjustment method: bonferroni

When the interaction effect turns out significant, pairwisecomparisons can be made between all combinations of conditions.

Post-hoc comparison for the interaction effect:

pairwise.t.test( x = data.long$score # column name of the dependent variable # merge both factor columns into a combined factor ,g = paste(data.long$treatment, data.long$pre_post) , paired = TRUE # since it is a paired design, set paired=TRUE , p.adjust.method = "bonferroni" # use bonferroni corrected p-values )

## ## Pairwise comparisons using paired t tests ## ## data: data.long$score and paste(data.long$treatment, data.long$pre_post) ## ## A post A pre B post B pre C post ## A pre 1.0000 - - - - ## B post 0.0971 0.0031 - - - ## B pre 1.0000 1.0000 0.0505 - - ## C post 1.3e-08 1.5e-08 1.0e-05 4.8e-07 - ## C pre 1.0000 1.0000 0.2321 1.0000 2.6e-07## ## P value adjustment method: bonferroni

Show code for data generation

set.seed(02) # set seednrofcondsf1 = 3 # set number of conditions for factor 1nrofcondsf2 = 2 # set number of conditions for factor 2nrofsubs = 30 # set number of subjects per conditionsubj = as.factor(rep(x= 1:(nrofsubs*nrofcondsf2), times=nrofcondsf1)) # create array with subject IDs# create array witht treatment conditionstreatment = as.factor(rep(LETTERS[1:nrofcondsf1],each=nrofsubs*nrofcondsf2)) # create array with control / experimentalgender = as.factor(rep(c("men","women"),times=nrofsubs*nrofcondsf1)) # create array with measurement valuesscore = as.vector( replicate(nrofcondsf1, replicate ( nrofcondsf2 , rnorm( n = (nrofsubs), mean = 0 , sd = sample(5,1) ) + sample(8,1)+10 ) ) ) # combine arrays into a data.framedata.long = data.frame(subj, score, treatment, gender); # delete unnecessary arraysrm(list=c("gender","nrofcondsf1","nrofcondsf2","nrofsubs","score","subj","treatment"))

data.long
subj	score	treatment	gender
1	14.65620	A	men
2	7.19054	A	women
1	11.15816	B	men
2	11.50623	B	women
1	12.53758	C	men
2	11.33041	C	women

10.1 Fitting themodel

Fit mixed-model using ezANOVA() function

mix.model = ezANOVA(data = data.long # name of the data frame (long format) , dv = score # column name of the dependent variabele , wid = subj # column name indicating subject ID , within = treatment # column name of the within factor , between = gender # column name of the between factor , return_aov = TRUE # add information to the output for normality testing )

10.2 Checkingassumptions

Test assumption of normality of the residuals

shapiro.test(mix.model$aov$`subj:treatment`$residuals)hist(mix.model$aov$`subj:treatment`$residuals , xlab = 'Residuals' , main = 'Histogram of residuals')

## ## Shapiro-Wilk normality test## ## data: mix.model$aov$`subj:treatment`$residuals## W = 0.98601, p-value = 0.2525

Test assumption of sphericity

mix.model$`Mauchly's Test for Sphericity`

## Effect W p p<.05## 3 treatment 0.7874388 0.001101995 *## 4 gender:treatment 0.7874388 0.001101995 *

10.3 Running thetest

If the assumption of sphericity is met use the normal, uncorrectedANOVA output

mix.model$ANOVA

## Effect DFn DFd F p p<.05 ges## 2 gender 1 58 0.02901668 0.8653333476 0.0001968976## 3 treatment 2 116 9.36206393 0.0001701106 * 0.0891490785## 4 gender:treatment 2 116 0.06453384 0.9375380257 0.0006742059

If the assumption of sphericity is violated, use the corrected ANOVAoutput (The Greenhouse-Geisser correction is more conservative than theHuynd-Feldt correction)

mix.model$`Sphericity Corrections`

## Effect GGe p[GG] p[GG]<.05 HFe p[HF]## 3 treatment 0.8247007 0.0004808571 * 0.8457291 0.0004243621## 4 gender:treatment 0.8247007 0.9075524117 0.8457291 0.9118589378## p[HF]<.05## 3 *## 4

10.4 Post-hocanalysis

When there are one or more significant main-effects but nosignificant interaction effect post-hoc comparisons can be made usingthe pairwise.t.test() function.

Post-hoc comparison for the between-subjects factor:

pairwise.t.test( x=data.long$score # column name of the dependent variable, g=data.long$gender # column name of the independent between-subjects variable, paired = TRUE # for the within factor, set paired=TRUE, p.adjust.method = 'bonferroni' # use bonferroni corrected p-values)

## ## Pairwise comparisons using paired t tests ## ## data: data.long$score and data.long$gender ## ## men## women 0.8## ## P value adjustment method: bonferroni

Post-hoc comparison for the within-subjects factor:

pairwise.t.test( x=data.long$score # column name of the dependent variable, g=data.long$treatment # column name of the independent within-subjects variable, paired = TRUE # for the within factor, set paired=TRUE, p.adjust.method = 'bonferroni' # use bonferroni corrected p-values)

## ## Pairwise comparisons using paired t tests ## ## data: data.long$score and data.long$treatment ## ## A B ## B 0.0024 - ## C 0.4130 0.0035## ## P value adjustment method: bonferroni

When the interaction-effect turns out significant, pairwisecomparisons within each factor are not very informative about what theinteraction looks like. Pairwise comparisons between all combinations ofconditions is inappropriate when dealing with mixed designs. The bestway to interpret an interaction effect is by looking at the interactionpattern in the data using descriptive statistics and visualization.

FAQs

How to find p-value with test statistic? ›

The p-value is calculated using the sampling distribution of the test statistic under the null hypothesis, the sample data, and the type of test being done (lower-tailed test, upper-tailed test, or two-sided test). The p-value for: a lower-tailed test is specified by: p-value = P(TS ts | H ₀ is true) = cdf(ts)

Find Out More ›

How to interpret p test results? ›

These are as follows: if the P value is 0.05, the null hypothesis has a 5% chance of being true; a nonsignificant P value means that (for example) there is no difference between groups; a statistically significant finding (P is below a predetermined threshold) is clinically important; studies that yield P values on ...

Find Out More ›

What is the overview of test statistic? ›

The test statistic is a number calculated from a statistical test of a hypothesis. It shows how closely your observed data match the distribution expected under the null hypothesis of that statistical test.

Learn More Now ›

What is the p-value for dummies? ›

Key Takeaways. A p-value is a statistical measurement used to validate a hypothesis against observed data. A p-value measures the probability of obtaining the observed results, assuming that the null hypothesis is true. The lower the p-value, the greater the statistical significance of the observed difference.

Keep Reading ›

What is a good p-value? ›

The p-value can be perceived as an oracle that judges our results. If the p-value is 0.05 or lower, the result is trumpeted as significant, but if it is higher than 0.05, the result is non-significant and tends to be passed over in silence. So what is the p-value really, and why is 0.05 so important?

See Details ›

How to find p-value with test statistic on TI-84? ›

TI-83 or 84

Type in the hypothesized proportion (p0), X, sample size, arrow over to the ≠, <, > sign that is the same in the problems alternative hypothesis statement then press the [ENTER] key, arrow down to [Calculate] and press the [ENTER] key. The calculator returns the z-test statistic and the p-value.

Get More Info Here ›

Is the p-value of 0.03 significant? ›

The p-value obtained from the data is judged against the alpha. If alpha=0.05 and p=0.03, then statistical significance is achieved. If alpha=0.01, and p=0.03, statistical significance is not achieved.

View Details ›

How do you read a P test? ›

If two lines show up, even if the test line (T) is very faint, that's a positive result. If only the control line (C) shows up, the test is negative. That means there is not enough hCG present in the urine for a positive result. Either you're not pregnant or it's too early to test.

Learn More Now ›

Why is my p-value so low? ›

A very small P-value indicates that the null hypothesis is very incompatible with the data that have been collected.

What is the difference between the p-value and the test statistic? ›

The p value, or probability value, tells you how likely it is that your data could have occurred under the null hypothesis. It does this by calculating the likelihood of your test statistic, which is the number calculated by a statistical test using your data.

Learn More Now ›

How do you read a test statistic? ›

We can work out the chances of the result we have obtained happening by chance. If a p-value reported from a t test is less than 0.05, then that result is said to be statistically significant. If a p-value is greater than 0.05, then the result is insignificant.

Discover More Details ›

How to calculate p-value without standard deviation? ›

When the population standard deviation is unknown, use the t -distribution to find the p-value. If the p-value is the area in the left-tail: Use the t. dist function to find the p-value.

Keep Reading ›

How do you explain p-value to someone? ›

The P value is defined as the probability under the assumption of no effect or no difference (null hypothesis), of obtaining a result equal to or more extreme than what was actually observed. The P stands for probability and measures how likely it is that any observed difference between groups is due to chance.

Learn More ›

How do you explain p-value to a child? ›

The p-value is like a score that tells you how likely it is that your car is really faster than the other one, and not just because of luck. The lower the p-value, the more confident you can be that your car is truly faster.

How do you know if something is statistically significant? ›

A study is statistically significant if the P value is less than the pre-specified alpha. Stated succinctly: A P value less than a predetermined alpha is considered a statistically significant result. A P value greater than or equal to alpha is not a statistically significant result.

View Details ›

How to find p-value from z-test statistic? ›

Finding the p value from a z value, using the table with standard normal probabilities

Find the row corresponding to the z value you found up to the first decimal, and the column corresponding to the second decimal. ...
The two sided p value is 2×(1−pleft)

Show Me More ›

How to find p-value from chi-square test statistic? ›

The p-value is equal to one minus the area under the curve corresponding to the chi-square test statistic. So, the p-value can be computed by subtracting 0.90 from 1: P = 1 − 0.90 = 0.10 .

Show Me More ›

How do you find the p-value from the F test statistic? ›

To find the p values for the f test you need to consult the f table. Use the degrees of freedom given in the ANOVA table (provided as part of the SPSS regression output). To find the p values for the t test you need to use the Df2 i.e. df denominator.

Read The Full Story ›

How to get p-value from test statistic excel? ›

In the fx tab above the cells, enter the TTEST's formula =T. TEST(array1, array2, tails, type), replacing array1, array2, tails, and type with values or cell numbers such as B3, B6, B9, and so on, then press the Enter key to calculate the P-Value.

Learn More ›