EPPP Research / Statistics
Terms
- Def: cluster sampling *
- Random selection of naturally occurring groups, rather than individuals
- Def: analog study
- Assessing a phenomenon under conditions that resemble the phenomenon in the field
- Def: a cross-sequential research design
-
A combination of longitudinal and cross-sectional designs
Subjects divided into age groups
Assessed on dependent variable repeatedly over time
- Developmental research designs
-
Longitudinal
Cross-sectional
Cross-sequential
- Describe a matching design
- Grouping subjects similar on an extraneous variable and then assigning members of the group to each treatment condition
- Describe stratified random sampling
-
Random sampling of sub-groups of a population
eg children, teens, young adults, etc
- Describe multiple baseline study *
-
Single subject
Application of treatment across different baselines (behaviors, settings, individuals)
Used when reversal is not possible or is unethical
- Describe a one-group time-series design
-
Multiple pre-tests, followed by treatment, followed by multiple post-tests
Controls for maturation, testing and regression effects
Vulnerable to history (an external event occurring at the same time as the treatment)
- Formula: variance (s squared)
-
s squared = (sum of (X - mean) squared) / denominator
population: denominator is N
sample: denominator is n - 1
- Formula: z score
- (X - mean) / standard deviation
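The variance and z score cards above can be checked by hand with a small sketch. The function names and the eight scores are illustrative assumptions, not part of the deck.

```python
import math

def variance(scores, sample=False):
    """Sum of squared deviations from the mean, divided by N (population) or n - 1 (sample)."""
    n = len(scores)
    mean = sum(scores) / n
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (n - 1 if sample else n)

def z_score(x, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (x - mean) / sd

scores = [2, 4, 4, 4, 5, 5, 7, 9]           # made-up data; mean is 5
pop_var = variance(scores)                   # denominator N
samp_var = variance(scores, sample=True)     # denominator n - 1
z = z_score(9, 5, math.sqrt(pop_var))        # raw score 9 expressed as a z score
```

With these scores the population variance is 4, the sample variance is 32/7, and a raw score of 9 sits two standard deviations above the mean.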
- T score attributes
-
mean = 50
sd = 10
- Stanine attributes
-
Divides the score distribution into nine intervals (each 0.5 SD wide, except the two open-ended extremes)
Mean = 5
SD = 2 (approximately)
- Formula: standard error of the mean *
-
SE = standard deviation / square root of N
also is SD of the sampling distribution of means
the expected difference between the sample mean and the population mean
- Chi square requirements *
-
Independent observations
Mutually exclusive categories
Frequency data (counts), not percentages
- Formula: ANOVA mean square
- mean square = sum of squares / df
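The T score, standard error, and mean square cards above each reduce to a one-line formula. The sketch below is a minimal illustration; the function names and input values are assumptions chosen for easy arithmetic.

```python
import math

def t_score(z):
    """Convert a z score to a T score (mean 50, SD 10)."""
    return 50 + 10 * z

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)

def mean_square(sum_of_squares, df):
    """ANOVA mean square: sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df

t = t_score(1.5)              # z of 1.5 on the T scale
se = standard_error(15, 25)   # SD of 15 with N of 25
ms = mean_square(26, 2)       # sum of squares of 26 on 2 df
```

Here the T score is 65, the standard error is 3, and the mean square is 13.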
- Use: phi coefficient
- Two dichotomous variables
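As a rough illustration of the phi coefficient card, the sketch below computes phi from the four cell counts of a 2 x 2 table using the standard formula (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)). The function name and the counts are made up for the example.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 frequency table laid out as [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: rows are one dichotomous variable, columns the other
phi = phi_coefficient(30, 10, 10, 30)
```

With these counts phi works out to 0.5.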
- Use: point-biserial coefficient *
-
One interval or ratio variable
One naturally dichotomous variable (2 categories)
- Use: biserial coefficient *
-
One interval or ratio variable
One artificially dichotomous variable (2 categories)
(eg scores above vs below a cutoff)
- Use: contingency coefficient
- Two nominally scaled variables, each with more than 2 categories
- Use: canonical correlation
-
Multiple predictors and
multiple criterion variables
- Use: Spearman's rho
- Both predictor and criterion variables are ranked
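A minimal sketch of Spearman's rho, assuming no tied scores so that the shortcut formula 1 - 6 * sum(d squared) / (n * (n squared - 1)) applies. The helper names and data are illustrative only.

```python
def ranks(values):
    """Rank scores from 1 (lowest) to n (highest); assumes there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho from the differences between paired ranks (no-ties formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

predictor = [10, 20, 30, 40, 50]   # made-up scores
criterion = [1, 3, 2, 5, 4]
rho = spearman_rho(predictor, criterion)
```

The rank differences here are 0, -1, 1, -1, 1, so the sum of squared differences is 4 and rho is 0.8.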
- Def: coefficient of determination
-
Pearson r squared
Proportion of variability in one variable accounted for by the other
- In an ANOVA, what does within group variance measure?
- Random variance
- Which has the smallest variance: the population of scores, a single sample of scores, or the sampling distribution of means?
- The sampling distribution of means
- When are non-parametric tests used?
- When normality can't be assumed
- When homogeneity of variance is compromised, the best way to assure result robustness is...?
- To keep sample size equal
- Use: eta correlation
- With continuous variables that have a non-linear (curvilinear) relationship
- Distinguish the use of: t-test, one way ANOVA, factorial ANOVA, MANOVA and ANCOVA *
-
t-test: compares a pair of means
One way ANOVA:
1 independent variable; 2 or more groups
Factorial ANOVA:
>1 independent variable
permits analysis of interaction effects
MANOVA:
>1 dependent variable
reduces p(Type I error) relative to running a separate ANOVA on each dependent variable
ANCOVA:
statistically controls for an extraneous variable (the covariate)
- Def: internal validity
- The extent to which a study permits the conclusion that there is a causal relationship between the independent and dependent variables
- Threats to internal validity
-
History - an external event
Maturation
Testing (practice effects from repeated testing)
Changes in instrumentation
Statistical regression
Selection (pre-existing subject characteristics)
Attrition (systematic differences between completers and dropouts)
Experimenter bias
- Methods for controlling threats to internal validity
-
Randomization
Matching
Blocking
Hold extraneous variables constant
ANCOVA
- Def: matching
- Grouping subjects by status on extraneous variable and then randomly assigning from within groups
- Def: blocking
- Treating an extraneous variable like another independent variable
- Def: time series design
-
Multiple pre-tests
Treatment
Multiple post-tests
History is a threat to internal validity
- Bias in longitudinal studies
-
Tendency to underestimate age related change, esp decrements
Drop outs tend to be poorer performers
Practice effects on measures
- Bias in cross sectional studies
-
Overestimation of effects due to aging
Cohort effects
Experience
- Def: Type II error (beta) *
-
retaining a false null hypothesis
failing to detect a true effect
- Techniques to increase the validity coefficient
- Increase the range of scores
- Def: shrinkage *
- Occurs when predictors are DEVELOPED on one sample and then VALIDATED on another. The correlation coefficient for the second sample is likely to be lower.
- def: power *
-
ability to detect a treatment effect
p (rejecting a false null hypothesis)
p (not making a type II error)
1 - beta
- factors affecting power
-
sample size - larger
alpha - larger
one-tailed test (when the direction is predicted correctly)
magnitude of the population difference - larger
- assumptions of parametric tests
-
normal distribution of the dependent variable
homogeneity of variance
independence of observations - most critical
- Def: F statistic
- In an ANOVA, the ratio of between group variance over within group variance
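The F ratio can be traced step by step for a one-way design. The sketch below is a minimal hand computation under the usual definitions (mean square = sum of squares / df); the function name and the three small groups are made up for illustration.

```python
def one_way_f(groups):
    """F = mean square between groups / mean square within groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # illustrative data; group means 2, 3, 6
f_ratio = one_way_f(groups)
```

For these groups the between-group sum of squares is 26 on 2 df and the within-group sum of squares is 6 on 6 df, so F = 13 / 1 = 13.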
- Common non-parametric tests
-
Chi-square - frequencies of nominal data
Mann-Whitney U - non-parametric equivalent of a t-test; 2 independent groups - ordinal (ranked) scores
Wilcoxon Matched-Pairs test - non-parametric equivalent of a t-test for correlated scores
Kruskal-Wallis test - non-parametric alternative to a one-way ANOVA
- ANOVA post-hoc tests
-
Scheffe provides greatest protection against a type I error, but increases probability of a type II error
Tukey most appropriate for pairwise comparisons
- Calculation of CHI-square expected frequencies
-
simple case (equal expected frequencies) = total subjects / number of cells
complex case (contingency table) = (column total * row total) / total N
- Assumptions of Pearson r
-
linear relationship between variables
homoscedasticity - equal variability on y throughout the x range
unrestricted range - r is attenuated when the range of scores on either variable is restricted
- Use: discriminant function analysis
-
scores are combined to determine group assignment
in contrast to multiple regression in which multiple variables are combined to predict a score
- Def: differential validity
-
in discriminant analysis, each predictor has a high correlation with a single category criterion and a low correlation with the other category criteria
IQ has low differential validity
- Use: structural equation modeling
- testing causal models based on multiple variables
- Techniques of structural equation modeling *
-
Path analysis - one way causal relationships among observed variables
LISREL - one or two way causal analysis with both observed and inferred (latent) variables
helps sort out the contributions of true score and error variance
- Use: trend analysis
-
determination of the shape of the relationship between variables: eg linear, quadratic, cubic, quartic...
yields the significance of the trend
- Def: sampling distribution
-
a distribution of the values of a statistic (eg the mean) with each value computed from same-sized samples drawn with replacement from the population
has less variability than the population distribution
- Central limit theorem *
-
1. As sample size increases the shape of the sampling distribution of means approaches a normal shape - even if the distribution of scores is not normal
2. The mean of the sampling distribution of means is equal to the mean of the population
- Rosenthal effect
-
aka experimenter expectancy effect
unintentional influence the experimenter exerts that pushes results toward the expected outcome
- experiment-wise error rate
- probability of making at least 1 type I error when multiple comparisons are made in a single experiment
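To see how quickly the experiment-wise error rate grows, the sketch below uses the common approximation 1 - (1 - alpha) ^ c, which assumes the c comparisons are independent. The function name and the numbers are illustrative.

```python
def experimentwise_error_rate(alpha, comparisons):
    """Probability of at least one Type I error across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons

# Ten comparisons, each tested at alpha = .05 (assumed independent)
rate = experimentwise_error_rate(0.05, 10)
```

With ten comparisons at alpha = .05 the rate is roughly .40, which is why post-hoc procedures adjust for multiple comparisons.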
- heteroscedasticity
- unequal variability of y scores at different values of x
- Effect on t test when comparison groups are highly correlated
- Within group variability is suppressed, giving an artificially high t value
- Threats to external validity *
- Interaction between selection and treatment would create problems in generalization
- Use: tetrachoric coefficient
- 2 artificially dichotomous variables
- Use: paired t test
-
Analysis of means when groups are not independent (eg twin studies or repeated measures)
df = # of pairs - 1
- ANOVA vs multiple regression
-
ANOVA uses categorical independent variables only
Multiple regression can use either categorical or continuous variables
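Returning to the paired t test card, a minimal sketch of the computation: t is the mean of the pair differences divided by the standard error of those differences, with df = number of pairs - 1. The function name and the before/after scores are made up for illustration.

```python
import math

def paired_t(first, second):
    """Paired t test on matched scores; returns (t, df)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample SD of the differences (denominator n - 1)
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

before = [10, 12, 9, 11]   # hypothetical repeated-measures data
after = [12, 15, 10, 13]
t_value, df = paired_t(before, after)
```

The differences are 2, 3, 1, 2, so the mean difference is 2, t is about 4.90, and df is 3.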