Research Design & Statistics
Terms
 independent variable

input variable,
variable that is manipulated by researcher,
IVs affect the DVs
 dependent variable

output variable,
measured, not manipulated (d = describe),
changes as a result of manipulations of the IV
 internal validity, definition
 allows researcher to say there is a causal relationship between IVs and DVs
 threats to internal validity

extraneous factors that, rather than the IV, may account for the observed relationship
1. history, 2. maturation, 3. testing, 4. instrumentation, 5. statistical regression, 6. selection, 7. differential attrition (differences between dropouts & non-dropouts), 8. experimenter bias
 most powerful method of controlling threats to internal validity
 random assignment of subjects, ensures equivalency of extraneous factors, "great equalizer"
 Other methods of controlling threats to internal validity
 1. matching on ex var, 2. blocking: studying ex var as an IV, 3. including only homogeneous subj on ex var, 4. ANCOVA: mathematical adj to equalize on ex var
 external validity, definition
 generalizability of results
 threats to external validity

selection effects
testing effects
history effects (tx doesn't gen beyond setting/time conducted)
demand characteristics (cues in setting)
Hawthorne Effects
order effects in repeated measures
 how to control for threats to external validity

random selection
naturalistic/field research
single-/double-blind designs
control order effects with counterbalancing
 true vs. quasi-experimental design
 in quasi-experimental designs subjects are not randomly assigned to groups (unlike true) but variables are manipulated (like true).
 correlational design
 variables not manipulated, no causal relationship assumed only degree of relationship
 developmental research
 assess variables as a function of dev over time, e.g. aging on IQ scores. Types: longitudinal, cross-sectional, cross-sequential
 time-series design
 DV measured several times, at regular intervals, before & after tx administered. History is the major threat to internal validity of a one-group time-series study
 single-subject design
 single subject, at least one baseline and one tx phase. Types: AB (baseline-tx), reversal (ABAB), and multiple baseline
 qualitative (descriptive) research

collect data to develop theory
observation, interviews, surveys, case studies
 sequence of scientific inquiry

1. hypothesis
2. operational definition
3. collect/analyze data to test hypothesis
 predictor variables
 independent variables not manipulated, like in correlational designs
 criterion variables
 dependent variables in nonexperimental studies, ex. correlational designs
 levels

refers to values an IV can take
one IV can have multiple levels, ex. treatment (IV): drug, therapy, drug+therapy (3 levels); in a study of gender & IQ, gender is an IV w/2 levels (M/F)
 factorial design
 multiple IV design where you combine every level of one IV with every level of other IV
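
A minimal sketch of how the cells of a factorial design can be enumerated. The IV names and levels below are hypothetical, borrowed from the "levels" card for illustration only.

```python
from itertools import product

# Hypothetical IVs and their levels (illustration only).
treatment = ["drug", "therapy", "drug+therapy"]  # IV 1: 3 levels
gender = ["M", "F"]                              # IV 2: 2 levels

# A factorial design crosses every level of one IV with every level of the other.
cells = list(product(treatment, gender))

for cell in cells:
    print(cell)
print(len(cells), "cells in a 3 x 2 design")
```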
 confound
 an extraneous variable that causes changes in DV
 experimenter expectancy

a.k.a. Rosenthal effect or Pygmalion effect
change in behavior is a result of experimenter expectancies rather than the IV
overcome with double-blind techniques
 random assignment vs. random selection/sampling
 Random selection is the method of selecting subjects for the study (equal chance of participation); random assignment happens after subjects have been selected (equal chance of assignment to each group).
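
The two steps can be sketched as follows. The population, sample size, and group sizes are made-up values, and the seed is fixed only so the sketch is repeatable.

```python
import random

rng = random.Random(42)  # fixed seed, only for repeatability

# Hypothetical sampling frame of 1000 people.
population = [f"person_{i}" for i in range(1000)]

# Random selection: each member of the population has an equal chance
# of being chosen for the study.
sample = rng.sample(population, 20)

# Random assignment: each selected subject has an equal chance of
# ending up in either group.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:10]
control_group = shuffled[10:]

print(len(treatment_group), len(control_group))
```

Selection bears on external validity (who the results generalize to); assignment bears on internal validity (group equivalence).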
 blocking vs. matching in controlling internal validity threats
 matching ensures equivalency of ex var (no addition of IV) while blocking determines effects of ex var (makes it an IV)
 Analysis of Covariance (ANCOVA)

statistical strategy for increasing internal validity
post-hoc "matching"
 stratified random sampling
 taking random sample from each of several subgroups of total target pop (ex. age ranges). Purpose to ensure proportionate representation of defined pop subgroups
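
A rough sketch of proportionate stratified sampling. The strata, their sizes, and the helper name `stratified_sample` are assumptions for illustration; with other numbers the rounding step may not sum exactly to the requested sample size.

```python
import random

rng = random.Random(0)  # fixed seed, only for repeatability

# Hypothetical population: each person is tagged with an age-range stratum.
population = (
    [("18-29", i) for i in range(500)]
    + [("30-49", i) for i in range(300)]
    + [("50+", i) for i in range(200)]
)

def stratified_sample(pop, n, rng):
    """Draw a random sample from each stratum, proportional to its share of pop."""
    strata = {}
    for stratum, person in pop:
        strata.setdefault(stratum, []).append((stratum, person))
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(pop))  # proportionate allocation
        sample.extend(rng.sample(members, k))
    return sample

sample = stratified_sample(population, 100, rng)

counts = {}
for stratum, _ in sample:
    counts[stratum] = counts.get(stratum, 0) + 1
print(counts)  # {'18-29': 50, '30-49': 30, '50+': 20}
```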
 cluster sampling
 unit of sampling is naturally occurring groups of ind rather than the ind (ex. random states then school districts then classrooms)
 analogue research
 conclusions are drawn from lab research about real-world phenomena (ex. Milgram's obedience to auth studies). often lacks ext val
 counterbalancing
 controls order effects: subjects receive tx in different orders, ex. Latin square design
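
One simple way to build a Latin square of treatment orders is cyclic rotation, sketched below. The condition labels are placeholders, and this basic square balances position only, not which condition precedes which.

```python
def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated left by i."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]

# Each row is the treatment order for one subject (or one group of subjects).
orders = latin_square(["A", "B", "C"])
for row in orders:
    print(row)
# ['A', 'B', 'C']
# ['B', 'C', 'A']
# ['C', 'A', 'B']
```

Every condition appears once in each row and once in each ordinal position.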
 cohort effects
 observed differences between age groups may have to do with experience rather than age. problem with cross-sectional designs
 cross-sequential design

combines longitudinal and cross-sectional designs.
samples of diff age groups assessed on two or more occasions.
controls cohort effects, less time-consuming than longitudinal (helps w/dropout)
 major threat to single-subject designs
 much variability in target behavior, difficult to establish reliable baseline
 protocol analysis
 research involving collection and analysis of verbatim reports
 ordinal data, def and ex.

order of categories but not HOW MUCH more/less
ranks, Likert scales
 interval data vs. ratio data

interval: equal distances but NO absolute zero point (complete absence of attribute), ex. temp, IQ
ratio: same as interval but with absolute zero, can mult & divide
 negatively skewed distribution

"easy test"
few scores fall at low end (tail on left/negative end w/lump on right)
 positively skewed distribution

"hard test"
few scores fall at high end (tail on right/positive end w/lump on left)
 mean, median, mode

mean: average, preferred measure of central tendency, sensitive to extremes ("pulled toward tail")
median: Md, middle value when ordered from low to high, not as affected by extreme scores, useful for skewed distributions
mode: most frequent value, can be more than one (bimodal, multimodal)
 variance

measure of variability of distribution
average of squared differences of each score from mean
equal to the square of the standard deviation (s^2)
 standard deviation

square root of variance (s)
expected deviation from mean of a score chosen at random
 z-score (standard score)

how many standard deviations a given raw score is from the mean
z-score distributions have sd of 1 and mean of 0
 linear transformation
 when transformation of scores does not change distribution shape, e.g. raw scores to z-scores
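
A small sketch of a linear transformation, using made-up raw scores. It converts raw scores to z-scores (and on to T-scores with mean 50, sd 10) and checks that skewness, used here as one index of shape, is unchanged. The `skewness` helper is written for this illustration.

```python
from statistics import mean, pstdev

raw = [52, 60, 61, 65, 70, 88]  # hypothetical raw scores

m, sd = mean(raw), pstdev(raw)
z = [(x - m) / sd for x in raw]  # z-scores: mean 0, sd 1
t = [50 + 10 * zi for zi in z]   # T-scores: mean 50, sd 10

def skewness(xs):
    """Population skewness: mean of cubed standardized deviations."""
    mu, s = mean(xs), pstdev(xs)
    return sum(((x - mu) / s) ** 3 for x in xs) / len(xs)

# All three values agree (up to floating-point error): shape is preserved.
print(round(skewness(raw), 6), round(skewness(z), 6), round(skewness(t), 6))
```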
 T-scores

mean of 50 and sd of 10
z-score of +1 equals T-score of 60
 percentile score vs. rank
 score referenced to items on test (70% correct), rank referenced to other scores in distribution (70th percentile: 70% scored below you)
 what is the distribution of percentile ranks?
 flat (rectangular) distribution
 nonlinear transformation
 converting scores will change shape of distribution, e.g. raw scores to percentile ranks
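
A sketch with invented scores showing why the percentile-rank conversion is nonlinear. Raw scores bunched in the middle become evenly spaced percentile ranks, i.e. a flat (rectangular) distribution. Percentile rank is taken here as the percent of scores falling below a given score, and the example has no tied scores.

```python
raw = [40, 55, 58, 60, 61, 62, 63, 65, 70, 95]  # hypothetical, bunched in the middle

def percentile_rank(x, scores):
    """Percent of scores in the distribution that fall below x."""
    return 100 * sum(s < x for s in scores) / len(scores)

pr = [percentile_rank(x, raw) for x in raw]

# Gaps between neighboring scores before and after the conversion.
raw_gaps = [b - a for a, b in zip(raw, raw[1:])]
pr_gaps = [b - a for a, b in zip(pr, pr[1:])]

print(raw_gaps)  # uneven: [15, 3, 2, 1, 1, 1, 2, 5, 25]
print(pr_gaps)   # even: every gap is 10.0
```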
 standard deviation curve stats (for normal distribution)

68% fall between ±1z or sd,
95% fall between ±2z or sd,
±1z or sd equivalent to PR of 84/16 or top/bottom 16%,
+2z or sd equivalent to 98th PR/top 2%
 sampling error

difference between sample mean and population mean (one type)
statistic (sample value) vs. parameter (population value)
 standard error of the mean

expected difference between sample mean and population mean,
sd / square root of N,
inverse relationship bt sample size and std. error of mean
 two-tailed vs. one-tailed hypothesis

two-tailed states a mean is different from another mean but does not specify in which direction
one-tailed states a mean is either > or < another mean
 Type I error

found difference when there isn't one (rejecting a true null hypothesis)
probability of making Type I error is alpha,
level set by researcher in advance
 Type II error

found NO difference when there in fact IS one (retaining a false null hypothesis)
probability of making Type II error is beta (usually can't be determined)
 power

probability of declaring there is a difference when one actually exists (rejecting false null hypothesis)
probability of NOT making TII error, or 1 - beta
 factors that affect power

1. sample size: larger > power
2. alpha: higher > power (but > chance of making TI error)
3. one-tailed > two-tailed
4. > difference between pop means under study
 parametric test

used for interval and ratio data
t-test and ANOVA
assumptions: normal distribution, homogeneity of variance, independence of observations (most imp.)
 nonparametric test

used for nominal or ordinal data
chi-square, Mann-Whitney U
no assumptions about distributions
less powerful than parametric tests
 similarity between parametric and nonparametric tests
 share assumption that data come from unbiased sample (random selection)
 t-test

compare two means (t for two)
one sample: sample mean to known pop mean (df = N - 1)
independent samples: means from two independent samples (df = N - 2, N = total # subj in study)
correlated samples: means of two correlated samples (before/after) (df = N - 1, N = # pairs)
 One-way ANOVA

one IV and 2+ groups/levels
statistic is F, ratio of between/within group variance
does not indicate which means are diff (need post-hoc tests)
 Factorial ANOVA

2 or more IVs and 1 DV
a.k.a. two-way (2 IVs) or three-way (3 IVs) ANOVA
assmt of both main and interaction effects
 MANOVA

multiple DVs and at least one IV
adv of reducing TI error over separate ANOVAs
 Chi-square

analyze nominal data
compares freq of nominal observations to freq expected under null hypo
 Mann-Whitney U
 compare two independent groups on DV w/RANK-ORDERED DATA (like t-test for independent samples)
 Wilcoxon Matched-Pairs Test
 compare two correlated groups on a DV w/RANK-ORDERED DATA (like t-test for correlated samples)
 Kruskal-Wallis Test
 compare two or more independent groups on DV w/RANK-ORDERED DATA (like one-way ANOVA)
 critical value

used to determine whether or not to reject null hypothesis (looked up in a table)
if obtained value exceeds critical value, reject null
value to use depends on preset alpha level and degrees of freedom for statistical test
 F ratio (ANOVA)

comparison of between-group variance (tx variance) and within-group variance (error variance)
desire between-group variance to be large (effect of tx) and within-group error to be small
 ANOVA summary table

Sum of Squares: variability of set of data (between, within)
DF: between = k - 1 (k = # groups), within = N - k
Mean Squares: Sum of Squares/DF (F ratio = MS between / MS within)
 danger of multiple comparisons (in post-hoc tests)

increase chance of making TI error
increases "experimentwise error rate"
 which of the post-hoc tests is the most conservative?

Scheffé
decreases TI but increases TII chance
 which of the post-hoc tests is appropriate if only conducting pairwise comparisons?

Tukey
provides enough protection against TI if only pairwise comp are made
 main effects vs. interaction effects

advantage of factorial ANOVAs
Main: effect of one IV by itself
Interaction: effect of an IV differs at different levels of other IVs
 marginal vs. cell means

Factorial ANOVA
examine main effects by examining differences in marginal means
examine interaction effects by examining differences in cell means: if plotted lines are not parallel (e.g. move in opposite directions or cross) across levels of the other IV, then there is an interaction effect; if they move in the same direction by the same amount (parallel), then there is no interaction effect
 what is caution in interpretation when you find interaction effects?
 must interpret the main effects with caution, main effects don't generalize to all levels of other IVs
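
A sketch of why main effects need cautious reading when an interaction is present, using invented 2 x 2 cell means (therapy by drug). Both sets of marginal means come out equal, suggesting no main effects, yet the drug clearly matters at each level of therapy.

```python
# Hypothetical cell means for a 2 x 2 design (illustration only).
cell_means = {
    ("no therapy", "no drug"): 10, ("no therapy", "drug"): 20,
    ("therapy", "no drug"): 20,    ("therapy", "drug"): 10,
}
therapy_levels = ["no therapy", "therapy"]
drug_levels = ["no drug", "drug"]

# Marginal means: average a level of one IV over the levels of the other IV.
therapy_marginals = {
    t: sum(cell_means[(t, d)] for d in drug_levels) / len(drug_levels)
    for t in therapy_levels
}
drug_marginals = {
    d: sum(cell_means[(t, d)] for t in therapy_levels) / len(therapy_levels)
    for d in drug_levels
}

# Effect of drug at each level of therapy, read from the cell means.
drug_effect = {
    t: cell_means[(t, "drug")] - cell_means[(t, "no drug")]
    for t in therapy_levels
}

print(therapy_marginals)  # both 15.0
print(drug_marginals)     # both 15.0
print(drug_effect)        # +10 without therapy, -10 with therapy (lines cross)
```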
 name 3 cautions in using Chi-square

1. all observations must be independent
2. each observation can only be classified into one category/cell: categories must be mutually exclusive
3. %s of observations w/i categories cannot be compared (must convert to #'s)
 correlation coefficient

ranges between -1.00 and +1.00
describes magnitude (absolute value) and direction (- or +)
variables vary directly (+) or inversely (-)
 relationship between correlation and causality
 correlation is a necessary but not sufficient condition of causality, correlation does not guarantee causality but if causal link is established then they must be correlated
 Pearson r (PPM)

calculates the relationship between two variables
most commonly used correlation coefficient in psychology
 What factors affect the Pearson r?

1. linearity: assumes linear rel (not curvilinear)
2. homoscedasticity: assumes = dispersion of scores (not heteroscedasticity)
3. range of scores: wider range will yield more accurate correlation (restricted range lowers r)
 coefficient of determination

squared correlation coefficient
indicates the percentage of variability in one measure accounted for by variability in other measure
if IQ & GPA correlate at .70 then 49% of variation in GPA can be explained by variation in IQ (rest is explained by unmeasured factors)
 point-biserial and biserial coefficients

point-biserial: relates one continuous var and one dichotomous var (gender)
biserial: two continuous var are correlated but one is artificially made dichotomous (high/low)
 Phi and Tetrachoric coefficients

Phi: when both variables are dichotomous
Tetrachoric: both variables artificially dichotomized
 contingency
 correlation between two nominal variables each having 2+ categories (ex. father's eye color and son's eye color)
 Spearman's Rho (rank-order correlation)
 correlate two variables ordinally ranked (compare two judges' rankings on same set of observations)
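
A sketch of Spearman's rho for two judges ranking the same five observations, using the standard shortcut formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), which holds only when there are no tied ranks. The rankings are invented.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for untied ranks."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

judge_a = [1, 2, 3, 4, 5]  # hypothetical rankings of five observations
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
print(round(rho, 2))  # 0.8
```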
 eta
 measures nonlinear relationships
 regression
 when two variables are correlated, allows you to estimate the value of one variable based on value of other
 predictor vs. criterion in regression equation
 predictor is value given and criterion is the predictee or value you are determining
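
A minimal least-squares sketch of using a predictor to estimate a criterion. The data are invented, and `predict` is a helper name chosen for this example.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]  # predictor scores (hypothetical)
y = [2, 4, 5, 4, 5]  # criterion scores (hypothetical)

mx, my = mean(x), mean(y)

# Least-squares slope and intercept of the regression line y' = a + b*x.
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
    (xi - mx) ** 2 for xi in x
)
intercept = my - slope * mx

def predict(value):
    """Estimated criterion score for a given predictor score."""
    return intercept + slope * value

print(round(slope, 2), round(intercept, 2), round(predict(6), 2))  # 0.6 2.2 5.8
```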
 regression analysis can be used as a substitute for what?
 one-way ANOVA
 multiple correlation coefficient (Multiple R)
 assesses relationship between two+ predictor var and ONE criterion var
 multiple regression
 use of scores on more than one predictor to estimate scores on a criterion
 the multiple correlation coefficient has highest predictive power when?
 predictor variables are highly correlated with criterion but not with each other (high correlation among predictors is multicollinearity)
 multiple correlation coefficient is never lower than what?
 the highest simple correlation bt an ind predictor and the criterion
 The multiple correlation coefficient can never be what?
 negative
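
For the two-predictor case, the last three cards can be checked with the standard formula R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2). The sketch below assumes a valid set of correlations with |r_12| < 1, and the values plugged in are invented.

```python
from math import sqrt

def multiple_r(r_y1, r_y2, r_12):
    """Multiple R for two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion
    r_12: correlation between the two predictors (must satisfy |r_12| < 1)
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)

uncorrelated = multiple_r(0.5, 0.5, 0.0)  # predictors unrelated to each other
collinear = multiple_r(0.5, 0.5, 0.8)     # predictors highly intercorrelated

print(round(uncorrelated, 3))  # 0.707
print(round(collinear, 3))     # 0.527
```

In both cases R is non-negative and at least as large as the highest simple correlation (.50), and it is larger when the predictors are uncorrelated with each other.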
 coefficient of multiple determination

R squared
like the squared Pearson r (coefficient of determination), this notes the proportion of variance in the criterion variable accounted for by the combo of predictor variables
 stepwise multiple regression

forward and backward
with each addition (forward) or removal (backward) of a predictor variable, determine whether predictive power of multiple R has changed
 canonical correlation
 used with multiple criterion and multiple predictor variables
 discriminant function analysis
 used to predict criterion group membership rather than a criterion score (which is what multiple regression predicts)
 differential validity
 when each predictor has different correlation with each criterion variable
 logistic regression

used when required assumptions for discriminant analysis are not met (ex. normal dist, homogeneity)
predictors can be nominal
primarily used w/dichotomous DVs or when subj can be classified into one of two criterion groups
 multiple cutoff
 identifying different cutoff scores on a series of predictors, must score at or above the cutoff on EACH predictor to be predicted as successful on criterion
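
The multiple cutoff rule can be sketched as a simple all-or-nothing check. Predictor names, cutoff values, and applicant scores are all hypothetical.

```python
# Hypothetical cutoff score for each predictor.
cutoffs = {"aptitude": 60, "interview": 7, "work_sample": 75}

def predicted_success(scores, cutoffs):
    """True only if the score on EACH predictor is at or above its cutoff."""
    return all(scores[name] >= cut for name, cut in cutoffs.items())

applicant_a = {"aptitude": 72, "interview": 8, "work_sample": 75}
applicant_b = {"aptitude": 95, "interview": 6, "work_sample": 90}

print(predicted_success(applicant_a, cutoffs))  # True
print(predicted_success(applicant_b, cutoffs))  # False
```

Applicant B fails despite high scores elsewhere: a high score on one predictor does not compensate for falling below the cutoff on another.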
 partial correlation
 statistically taking out or "partialling out" effect of a variable to control its effect on correlation
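
A sketch of the first-order partial correlation formula, r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), which removes the effect of a third variable z from the x-y correlation. The input correlations are invented.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: x and y correlate .60, and each correlates .50 with z.
r = partial_r(0.60, 0.50, 0.50)
print(round(r, 3))  # 0.467
```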
 path analysis

structural equation modeling technique
verify simpler causal models that propose one-way causal flows between variables
observed variables only
 LISREL

structural equation modeling technique
one-way and/or two-way causal relationships
latent and observed variables