Research Design & Statistics

Terms

independent variable
input variable,
variable that is manipulated by researcher,
IVs affect the DVs
dependent variable
output variable,
measured not manipulated (d=describe),
changes as result of manipulations of IV
internal validity, definition
allows researcher to say there is a causal relationship between IVs and DVs
threats to internal validity
may cause relationship instead of IV
1. history, 2. maturation, 3. testing, 4. instrumentation, 5. statistical regression, 6. selection, 7. differential attrition (differences between dropouts & non-dropouts), 8. experimenter bias
most powerful method of controlling threats to internal validity
random assignment of subjects, ensures equivalency of extraneous factors, "great equalizer"
Other methods of controlling threats to internal validity
1. matching on ex var, 2. blocking - studying ex var as an IV, 3. including only homogeneous subj on ex var, 4. ANCOVA - mathematical adj to equalize on ex var
external validity, definition
generalizability of results
threats to external validity
-selection effects
-testing effects
-history effects (tx doesn't gen beyond setting/time conducted)
-demand characteristics (cues in setting)
-Hawthorne Effects
-order effects in repeated measures
how control for threats to external validity
-random selection
-naturalistic/field research
-single/double-blind designs
-control order effects with counterbalancing
true vs quasi experimental design
in quasi subjects are not randomly assigned to groups (unlike true) but variables are manipulated (like true).
correlational design
variables not manipulated, no causal relationship assumed only degree of relationship
developmental research
assess variables as function of dev over time i.e. aging on IQ scores. Types: longitudinal, cross-sectional, cross-sequential
time-series design
DV measured several times, reg intervals, before & after tx administered. History major threat to internal validity of one-group time-series study
single-subject design
single subject, at least one baseline and one tx phase. Types: AB (baseline-tx), reversal (ABA, ABAB), and multiple baseline
qualitative (descriptive) research
-collect data to develop theory
-observation, interviews, surveys, case studies
sequence of scientific inquiry
1. hypothesis
2. operational definition
3. collect/analyze data to test hypothesis
predictor variables
independent variables not manipulated, like in correlational designs
criterion variables
dependent variables in non-experimental studies, ex. correlational designs
levels
-refers to values an IV can take
-one IV can have multiple levels, ex. treatment (IV): drug, therapy, drug+therapy (3 levels), gender & IQ: gender (IV w/2 levels, M/F)
factorial design
multiple IV design where you combine every level of one IV with every level of other IV
confound
an extraneous variable that causes changes in DV
experimenter expectancy
a.k.a. Rosenthal effect or Pygmalion effect
-change in behavior is result of experimenter expectancies rather than IV
-overcome with double-blind techniques
random assignment vs. random selection/sampling
Random selection is the method of selecting subjects for a study (each has an equal chance of participating); random assignment happens after subjects have been selected (each has an equal chance of assignment to any group).
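The two steps can be sketched in a few lines of Python. This is only an illustration: the subject IDs, sample size, and the helper name `select_and_assign` are assumptions, not part of the original notes.

```python
import random

def select_and_assign(population, n_sample, n_groups, seed=0):
    """Randomly select a sample, then randomly assign it to groups."""
    rng = random.Random(seed)
    # Random selection: every member has an equal chance of being in the study.
    sample = rng.sample(population, n_sample)
    # Random assignment: shuffle the selected subjects, then deal them into groups.
    rng.shuffle(sample)
    return [sample[i::n_groups] for i in range(n_groups)]

population = list(range(1, 101))  # hypothetical subject IDs 1..100
treatment, control = select_and_assign(population, n_sample=20, n_groups=2)
```

Selection bears on external validity (who the results generalize to); assignment bears on internal validity (group equivalence).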
blocking vs. matching in controlling internal validity threats
matching ensures equivalency of ex var (no addition of IV) while blocking determines effects of ex var (makes it an IV)
Analysis of Covariance (ANCOVA)
-statistical strategy for increasing internal validity
-post-hoc "matching"
stratified random sampling
taking random sample from each of several subgroups of total target pop (ex. age ranges). Purpose to ensure proportionate representation of defined pop subgroups
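A rough sketch of proportionate stratified sampling, assuming made-up age strata and a hypothetical helper `stratified_sample`; the same fraction is drawn at random from each subgroup so the sample mirrors the population proportions.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from every subgroup (stratum)."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population split into three age ranges.
strata = {
    "18-29": [f"a{i}" for i in range(40)],
    "30-49": [f"b{i}" for i in range(100)],
    "50+": [f"c{i}" for i in range(60)],
}
sample = stratified_sample(strata, fraction=0.10)
sizes = {name: len(chosen) for name, chosen in sample.items()}
```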
cluster sampling
-unit of sampling is naturally occurring groups of ind rather than the ind (ex. random states then school districts then classrooms)
analogue research
conclusions are drawn from lab research about real-world phenomenon (ex. Milgram's obedience to auth studies). often lacks ext val
counterbalancing
controls order effects, subjects receive tx in different order, ex. latin square design
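One simple way to build a Latin square is by cyclic rotation, sketched below. The function name and treatment labels are illustrative assumptions. A cyclic square balances the position of each treatment but does not balance which treatment immediately precedes which.

```python
def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated by i positions."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

# Each row is the treatment order for one subject (or one group of subjects).
orders = latin_square(["A", "B", "C"])
```

Every treatment appears exactly once in each row and once in each ordinal position.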
cohort effects
observed differences between age groups may have to do with experience rather than age. problem with cross-sectional designs
cross-sequential design
-combines longitudinal and cross-sectional designs.
-samples of diff age groups assessed on two or more occasions.
-control cohort effects, less time consuming than longitudinal (help w/drop out)
major threat to single-subject designs
-much variability in target behavior, difficult to establish reliable baseline
protocol analysis
research involving collection and analysis of verbatim reports
ordinal data, def and ex.
-order of categories but not HOW MUCH more/less
-ranks, likert scales
interval data vs. ratio data
interval: equal distances but NO absolute zero point (complete absence of attribute), ex temp, IQ
ratio: same as interval but with absolute zero, can mult & divide
negatively skewed distribution
"easy test"
few scores fall at low end (tail on left/negative end w/lump on right)
positively skewed distribution
"hard test"
few scores fall at high end (tail on right/positive end w/lump on left)
mean, median, mode
mean: average, preferred measure of central tendency, sensitive to extremes-"pulled toward tail"
median: Md, middle value when ordered from low to high, not as affected by extreme scores, useful for skewed distributions
Mode: most frequent value, can be more than one (multimodal, bimodal)
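A small illustration with Python's standard `statistics` module, using made-up scores with one extreme value, shows the mean being pulled toward the tail while the median stays put.

```python
import statistics

# Hypothetical scores, positively skewed by one extreme value.
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)        # sensitive to the extreme score
median = statistics.median(scores)    # middle value of the ordered scores
modes = statistics.multimode(scores)  # list of all most-frequent values
```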
variance
-measure of variability of distribution
-average of squared differences of each score from mean
-equal to the square of the standard deviation (s^2)
standard deviation
-square root of variance (s)
-expected deviation from mean of a score chosen at random
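The variance and standard deviation definitions can be sketched directly. This version divides by N (the "average of squared differences" wording above); a sample estimate of the population variance would divide by N - 1 instead. The scores are made up.

```python
import math

def variance(scores):
    """Average of squared deviations from the mean (divides by N)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5
var = variance(scores)
sd = standard_deviation(scores)
```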
z-score (standard score)
-how many standard deviations a given raw score is from the mean
-z-score distributions have sd of 1 and mean of 0
linear transformation
-when transformation of scores does not change distribution shape, i.e. raw scores to z-scores
t-scores
-mean of 50 and sd of 10
-z-score of +1 equals t-score of 60
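Both conversions are linear transformations and can be written as one-liners. The raw score, mean, and SD below are assumed values chosen for illustration.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z-score to a distribution with mean 50 and SD 10."""
    return 50 + 10 * z

z = z_score(raw=115, mean=100, sd=15)  # hypothetical scale: mean 100, SD 15
t = t_score(z)
```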
percentage score vs. percentile rank
percentage score is referenced to items on test (70% correct), percentile rank is referenced to other scores in distribution (70th percentile = 70% scored below you)
what is the distribution of percentile ranks?
flat (rectangular) distribution
nonlinear transformation
-converting scores will change shape of distribution, i.e. raw scores to percentile ranks
standard deviation curve stats (for normal distribution)
68% fall between +-1z or sd,
95% fall between +-2z or sd,
+-1z or sd equivalent to PR of 84/16 or top/bottom 16%,
+2z or sd equivalent to 98th PR/top 2%
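These rounded figures can be checked against the standard normal distribution with `statistics.NormalDist` from the Python standard library; the variable names are just for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores within 1 and 2 SDs of the mean.
within_1_sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_2_sd = std_normal.cdf(2) - std_normal.cdf(-2)

# Percentile ranks at z = +1 and z = +2.
pr_at_plus_1 = std_normal.cdf(1) * 100
pr_at_plus_2 = std_normal.cdf(2) * 100
```

The exact values are close to, but not exactly, the rounded 68 / 95 / 84 / 98 figures used for memorization.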
sampling error
difference between sample mean and population mean (one type)
statistic (sample value) vs. parameter (population value)
standard error of the mean
expected difference between sample mean and population mean,
s.d/square root of N,
inverse relationship bt sample size and std. error of mean
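A quick numeric sketch of the formula, with an assumed SD of 15, shows the inverse relationship: quadrupling N halves the standard error.

```python
import math

def standard_error_of_mean(sd, n):
    """SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)

sem_small = standard_error_of_mean(sd=15, n=25)   # smaller sample
sem_large = standard_error_of_mean(sd=15, n=100)  # four times as many subjects
```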
two-tailed vs. one-tailed hypothesis
-two tailed states a mean is different from another mean but does not specify in which direction
-one tailed states mean is either > or < another mean
Type I error
-found difference when there isn't one
-probability of making type I error is alpha,
-level set by researcher in advance
Type II error
-found NO difference when there in fact IS one
-probability of making type II error is beta (usually can't be determined)
power
-probability of declaring there is a difference when one actually exists (rejecting false null hypothesis)
-probability of NOT making TII error or 1-beta
factors that affect power
1. sample size: larger > power
2. Alpha: higher > power (but > chance of making TI error)
3. one tailed > two-tailed
4. > difference between pop means under study
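The four factors can be illustrated with a power calculation for the simplest case. The sketch below assumes a one-sample z-test with known population SD, where `effect_size` is the difference between population means in SD units; the function name and all input values are hypothetical.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05, two_tailed=True):
    """Power of a one-sample z-test (assumes known population SD)."""
    z = NormalDist()
    shift = effect_size * n ** 0.5  # how far the true mean sits from the null, in SE units
    if two_tailed:
        crit = z.inv_cdf(1 - alpha / 2)
        # Probability of landing in either rejection region.
        return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)
    crit = z.inv_cdf(1 - alpha)
    return 1 - z.cdf(crit - shift)

baseline = z_test_power(0.5, 25)
larger_n = z_test_power(0.5, 50)
higher_alpha = z_test_power(0.5, 25, alpha=0.10)
one_tailed = z_test_power(0.5, 25, two_tailed=False)
larger_effect = z_test_power(0.8, 25)
```

Each of the four variations yields more power than the baseline, matching the list above. The one-tailed advantage holds only when the true difference is in the predicted direction.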
parametric test
-used for interval and ratio data
-t-test and ANOVA
-assumptions: normal distribution, homogeneity of variance, independence of observations (most imp.)
nonparametric test
-used for nominal or ordinal data
-chi-square, Mann-Whitney U
-less powerful than parametric tests
similarity between parametric and nonparametric tests
-share assumption that data come from unbiased sample (random selection)
t-test
-compare two means (t for two)
-one sample: sample mean to known pop mean (df=N-1)
-independent sample: means from two independent samples (df=N (total # subj in study) -2)
-correlated samples: means of two correlated samples (before/after) (df=N (# pairs)-1)
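As an illustration of the independent-samples case, here is a minimal pooled-variance t computation. It assumes homogeneity of variance, and the two groups of scores are made up.

```python
import math

def independent_t(group1, group2):
    """Pooled-variance t statistic and df for two independent samples."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((x - m1) ** 2 for x in group1)
    ss2 = sum((x - m2) ** 2 for x in group2)
    df = n1 + n2 - 2                      # total N minus 2
    pooled_var = (ss1 + ss2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, df

t, df = independent_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
```

The obtained t is then compared with the critical value for that df and the chosen alpha.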
One-way ANOVA
-one IV and 2+ groups/levels
-statistic is F, ratio of between/within group variance
-does not indicate which means are diff (post-hoc tests)
Factorial ANOVA
-2 or more IV and 1 DV
-a.k.a. two-way (2IV) or three-way (3IV) ANOVA
-assmt of both main and interaction effects
MANOVA
-multiple DVs and at least one IV
-adv of reducing TI error over separate ANOVAs
Chi-square
-analyze nominal data
-compares freq of nominal observations to freq expected under null hypo
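A small goodness-of-fit sketch, with invented frequencies and a null hypothesis of equal frequencies across three categories:

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [20, 30, 40]   # hypothetical counts in three nominal categories
expected = [30, 30, 30]   # counts expected under the null (equal frequencies)
chi2 = chi_square(observed, expected)
df = len(observed) - 1    # categories minus 1 for a goodness-of-fit test
```

Note that the inputs are raw frequencies, not percentages.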
Mann-Whitney U
-compare two independent groups on DV w/RANK ORDERED DATA (like t-test for independent samples)
Wilcoxon Matched-Pairs Test
-compare two correlated groups on a DV w/RANK ORDERED DATA (like t-test for correlated samples)
Kruskal-Wallis Test
-compare two or more independent groups on DV w/RANK ORDERED DATA (like one-way ANOVA)
critical value
-determine whether or not to reject null hypothesis (table)
- if obtained value exceeds critical value, reject null
-value to use depends on pre-set alpha level and degrees of freedom for statistical test
F ratio (ANOVA)
comparison of between-group variance (tx variance) and within-group variance (error variance)
-desire between group variance to be large (effect of tx) and within-group error to be small
ANOVA summary table
Sum of Squares: variability of set of data (between, within)
DF: between = k(# groups) - 1, within = N-k
Mean Squares: Sum of Squares/DF (illustrates in table the f-ratio)
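The summary-table entries can be computed by hand for a tiny data set. The sketch below uses made-up scores for three groups; the function name and dictionary keys are illustrative.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA summary table."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within: scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

table = one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```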
danger of multiple comparisons (in post-hoc tests)
-increase chance of making TI error
-increases "experiment-wise error rate"
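How quickly the error rate grows can be sketched with the usual approximation 1 - (1 - alpha)^c, which assumes the c comparisons are independent (pairwise comparisons on the same groups are not strictly independent, so treat this as a rough guide).

```python
def n_pairwise(k):
    """Number of pairwise comparisons among k group means."""
    return k * (k - 1) // 2

def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one Type I error across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons

comparisons = n_pairwise(4)  # four groups
rate = familywise_error_rate(alpha=0.05, n_comparisons=comparisons)
```

With four groups and alpha of .05 per comparison, the overall rate is roughly .26 rather than .05.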
which of the post-hoc tests is the most conservative?
-Scheffe
-decreases TI but increases TII chance
which of post-hoc tests is appropriate if only conducting pairwise comparisons?
-Tukey
-provides enough protection against TI if only pairwise comp are made
main effects vs. interaction effects
-Main: effect of one IV by itself
-Interaction: effects of an IV at different levels of other IVs
marginal vs. cell means
-Factorial ANOVA
-examine main effects by examining difference in marginal means
-examine interaction effects by examining differences in cell means: if the lines connecting cell means are not parallel (e.g. they cross or diverge across levels of the other IV) then there is an interaction effect; if they move in the same direction by the same amount (parallel) then there is no interaction effect
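A 2x2 sketch with invented cell means (equal cell sizes assumed) shows how marginal means and an interaction contrast are read off the cells.

```python
# Hypothetical cell means for a 2x2 design: medication (IV 1) x therapy (IV 2).
cell_means = {
    ("drug", "therapy"): 20.0,
    ("drug", "no therapy"): 10.0,
    ("placebo", "therapy"): 10.0,
    ("placebo", "no therapy"): 10.0,
}

def marginal_mean(cells, factor_index, level):
    """Average the cell means that share one level of one IV."""
    values = [m for key, m in cells.items() if key[factor_index] == level]
    return sum(values) / len(values)

# Main effect of medication: compare marginal means.
drug_mean = marginal_mean(cell_means, 0, "drug")
placebo_mean = marginal_mean(cell_means, 0, "placebo")

# Interaction: does the therapy effect differ across levels of medication?
therapy_effect_drug = cell_means[("drug", "therapy")] - cell_means[("drug", "no therapy")]
therapy_effect_placebo = cell_means[("placebo", "therapy")] - cell_means[("placebo", "no therapy")]
interaction_contrast = therapy_effect_drug - therapy_effect_placebo
```

A nonzero contrast means the lines are not parallel. Here therapy helps only in the drug condition, so the medication main effect should not be generalized across both therapy levels.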
what is caution in interpretation when you find interaction effects?
-must interpret the main effects with caution, main effects don't generalize to all levels of other IVs
name 3 cautions in using Chi-square
1. all observations must be independent
2. each observation can only be classified into one category/cell - must be mutually exclusive
3. %s of observations w/i categories cannot be compared (must convert to #'s)
correlation coefficient
-ranges -1.00 and +1.00
-describes magnitude (absolute value) and direction (-or+)
-vary directly (+) or inversely (-)
relationship between correlation and causality
-correlation is a necessary but not sufficient condition of causality, correlation does not guarantee causality but if causal link is established then they must be correlated
Pearson r (PPM)
-calculates the relationship between two variables
-most commonly used correlation coefficient in psychology
What factors affect the Pearson r?
1. linearity: assumes linear rel (not curvilinear)
2. homoscedasticity: assumes = dispersion of scores (not heteroscedasticity)
3. range of scores: wider range will yield more accurate correlation (restricted range lowers r)
coefficient of determination
-squared correlation coefficient
-indicates the percentage of variability in one measure accounted for by variability in other measure
-if IQ & GPA correlate at .70 then 49% of variation in GPA can be explained by variation in IQ (rest is explained by unmeasured factors)
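A minimal sketch of computing Pearson r and squaring it, using made-up paired scores:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]  # hypothetical scores on measure 1
y = [2, 1, 4, 3, 5]  # hypothetical scores on measure 2
r = pearson_r(x, y)
r_squared = r ** 2   # coefficient of determination
```

Here r is .80, so 64% of the variability in one measure is accounted for by the other.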
point-biserial and biserial coefficients
-point-biserial: relates one continuous var and one dichotomous var (gender)
-biserial: two continuous var are correlated but one is artificially made dichotomous (high/low)
Phi and Tetrachoric coefficients
-Phi: when both variables are dichotomous
-Tetrachoric: both variables artificially dichotomized
contingency
-correlation between two nominal variables each having 2+ categories (ex. father's eye color and son's eye color)
Spearman's Rho (rank-order correlation)
-correlate two variables ordinally ranked (compare two judges rankings on same set of observations)
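For rankings without ties, rho can be sketched with the rank-difference formula 1 - 6*sum(d^2) / (n(n^2 - 1)). The two judges' rankings below are invented.

```python
def spearman_rho(ranks_a, ranks_b):
    """Rank-difference formula; assumes no tied ranks."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

judge_1 = [1, 2, 3, 4, 5]  # hypothetical rankings of five entries
judge_2 = [2, 1, 4, 3, 5]
rho = spearman_rho(judge_1, judge_2)
```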
eta
-measures nonlinear relationships
regression
when two variables are correlated, allows you to estimate the value of one variable based on value of other
predictor vs. criterion in regression equation
predictor is value given and criterion is the predictee or value you are determining
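A least-squares line for one predictor can be sketched as follows; the data and the helper name `fit_line` are illustrative assumptions.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept

predictor = [1, 2, 3, 4, 5]  # hypothetical predictor scores
criterion = [2, 1, 4, 3, 5]  # hypothetical criterion scores
slope, intercept = fit_line(predictor, criterion)

# Estimate the criterion for a new predictor value.
predicted = intercept + slope * 6
```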
regression analysis can be used as a substitute for what?
one-way ANOVA
multiple correlation coefficient (Multiple R)
-assesses relationship between two+ predictor var and ONE criterion var
multiple regression
use of scores on more than one predictor to estimate scores on a criterion
the multiple correlation coefficient has highest predictive power when?
predictor variables are highly correlated with the criterion but not with each other (i.e. low multicollinearity)
multiple correlation coefficient is never lower than what?
-the highest simple correlation bt an ind predictor and the criterion
The multiple correlation coefficient can never be what?
negative
coefficient of multiple determination
R squared
-like the squared Pearson r (coefficient of determination), this gives the proportion of variance in the criterion variable accounted for by the combo of predictor variables
stepwise multiple regression
-forward and backward
-with each addition of predictor variable determine if predictive power of multiple R has increased
canonical correlation
-used with multiple criterion and multiple predictor variables
discriminant function analysis
-used to predict criterion group membership rather than a criterion score (multiple regression predicts a score)
differential validity
-when each predictor has different correlation with each criterion variable
logistic regression
-used when required assumptions for discriminant analysis are not met (ex. normal dist, homogeneity)
-predictors can be nominal
-primarily used w/dichotomous DVs or when subj can be classified into one of two criterion groups
multiple cutoff
identifying different cutoff scores on a series of predictors, must score at or above the cutoff on EACH predictor to be predicted as successful on criterion
partial correlation
statistically taking out or "partialling out" effect of a variable to control its effect on correlation
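For three variables, the first-order partial correlation can be sketched from the three simple correlations. The correlation values below are invented for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical: x and y correlate .50, but both also correlate with z.
r_xy_given_z = partial_correlation(r_xy=0.50, r_xz=0.60, r_yz=0.70)
```

In this made-up case most of the x-y relationship disappears once z is controlled.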
path analysis
-structural equation modeling technique
-verify simpler causal models that propose one-way causal flows between variables
-observed variables only
LISREL
-structural equation modeling technique
-one-way and/or two-way causal relationships
-latent and observed variables
