# Design and Statistics 2

## Terms

What does the validity of a test refer to?
Validity of a test refers to the extent to which the test measures what it was designed to measure.
What is internal validity?
Internal validity is the extent to which the intervention or manipulation (manipulation of the IV) accounts for changes in the DV.
What are the most common risks to internal validity?
1. History (an event, inside or outside the experiment, other than the manipulation of the IV, that could account for change in the DV)
2. Maturation (development/decay of biological or psychological factors)
3. Testing effect (taking test more than once...e.g., pretest effect)
4. Instrumentation (change in instrument/measuring procedures during course of experiment)
5. Regression to the mean (tendency of extreme scores to regress to mean)
6. Selection bias (systematic differences in groups based on selection of subjects into groups)
7. Attrition/Mortality (differential drop-outs between experimental/control groups)
8. Experimenter bias (reaction of experimenter may differ between groups, experimenter's procedures in study may vary over time).
What is the Rosenthal effect/Pygmalion effect?
The tendency for participants' performance to be affected by the expectations of the tester (e.g., some students perform better than other students because the teacher expects them to).
How do researchers safeguard internal validity?
1. Random assignment
2. Matching subjects on possibly relevant characteristics (less powerful than random assignment, but necessary when groups cannot be randomly assigned)
3. Blocking (study the effects of extraneous subject characteristics...treat them as IV)
4. ANCOVA (analysis of co-variance...statistical procedure developed to "account for" group differences in extraneous characteristics)
What is external validity?
Refers to the generalizability of study/test results. It refers to the limits or boundaries of the findings.
What are threats to external validity?
1. Interactions between subject selection and treatment (treatment effects do not generalize to other members of the population)
2. Testing and treatment effects (treatment effects do not generalize to individuals who did not participate in the pre-testing...e.g., from demand characteristics)
3. History and treatment effects (treatment effects depend on history of testing period)
4. Demand characteristics (cues in research that alert subjects to how they should respond)
5. Hawthorne effect (tendency of subjects to respond differently when they are observed in research setting)
6. Order effects (risk in repeated-measure studies)
What is the Hawthorne Effect?
Tendency of subjects to behave differently when they are in a research study
How do researchers safeguard external validity?
1. Random selection from population of interest
2. Conduct naturalistic/field research
3. Use single or double-blind research designs
4. Counterbalancing (e.g., vary the order of treatment strategies among participants to eliminate order effects)
5. Stratified random sampling (select random sample from each of several subgroups of target population)
6. Cluster sampling (the unit of sampling is naturally occurring groups of individuals rather than sampling on an individual level)
What is Content Validity?
The degree to which items on a test represent the domain that the test is supposed to measure (e.g., does a test of depression include items that measure vegetative sx, cognitive sx, mood sx, etc.)
What is Face Validity?
The extent to which a test appears to measure what it states it measures. Face validity can be misinterpreted as content validity.
What is Criterion-Related Validity?
Refers to the relationship between test items and a criterion of interest (correlation of a measure with a criterion of interest...correlation of performance on an academic test with grades).
2 types of Criterion Validity: Concurrent Validity, Predictive Validity
What is Concurrent Validity?
One type of criterion-related validity. Refers to correlation of performance on two measures at the same point in time (e.g., correlation of performance on an achievement test with current grades).
What is Predictive Validity?
A type of criterion-validity. Refers to the relation between test scores and future performance on a criterion of interest.
What is Construct Validity?
There are 2 ways to think about Construct Validity:
1. Construct validity of an experiment: If a causal relationship has been determined between the IV and DV (i.e., internal validity has been supported), construct validity refers to the experimenter's explanation for why the causal relationship exists (i.e., what aspect of the intervention was the causal agent).
2. More common use of construct validity is in the context of test development. It refers to the extent to which a measure assesses the psychological construct or trait it was designed to measure. In this context there are 2 types of Construct Validity: convergent validity, divergent validity
What is Convergent Validity?
A type of construct validity. The extent to which two measures assess similar constructs. Construct validity is supported when measures of the same construct are correlated. It also refers to the similarity among measures of the same construct that are administered in different formats (e.g., self and parent report of symptoms; essay response vs. multiple choice).
What is Divergent Validity?
Type of construct validity, also known as Discriminant Validity. Correlation between measures of different constructs. Construct validity is supported by low correlations between measures of different constructs.
How do Convergent and Divergent validity relate to construct validity?
Construct validity is supported when a measure correlates highly with other measures of the same construct (convergent validity) and does not correlate with measures of different constructs (divergent validity).
How can construct validity of a test be established?
1. Establish a relationship between performance on the test and the theoretical construct it was designed to measure (e.g., examine concurrent validity and predictive validity).
2. Conduct a factor analysis of test items and assess whether factors relate to constructs of interest.
3. Perform factor analysis of items on test with items from established tests with construct validity
4. Establish that performance on a test correlates more highly with tests of similar constructs (convergent validity) than with tests of different constructs (divergent validity)
What is reliability?
Reliability is the consistency of a measure. It refers to the extent to which random, unsystematic factors affect the measurement of a construct/trait.
Name 4 types of reliability.
1. Internal reliability (consistency among items in a test)
2. Test-retest reliability
3. Alternate form reliability
4. Interrater reliability
How are validity and reliability related?
A test can be reliable and not valid, but not the other way around. Reliability is necessary but not sufficient for validity.
What are the 2 components of a test score according to classic psychometric theory?
1. True score
2. Error score
True score refers to all systematic information that contributes to the score. Error refers to all the random "noise" in the score. Note: true score does not refer to the underlying construct of interest...only to systematic variation in the score.
How can the "true score" be identified?
True score is a hypothetical construct and cannot be directly observed. It is best estimated as the mean of repeated measures on the same test (each score includes error and the true score, so the true score is the average of these repeated measures). It is also assumed that the distribution of scores is normal.
What types of variance contribute to the reliability ratio?
The ratio of true score variance to observed score variance.
Remember that the measurement theory proposes that each test score consists of the "true score", reflecting stable characteristics of the participant, and "error", reflecting random/unpredictable variation in score.
What is the reliability coefficient?
This coefficient, often designated as r, ranges from 0 (indicating no relationship between 2 measures) to 1.00 (indicating perfect relationship between 2 measures)
r is used to reflect all 4 types of reliability (internal reliability, test-retest reliability, alternate form reliability, interrater reliability)
What measure is used for test-retest reliability and alternate form reliability?
Pearson's product moment correlation
What is the most common measure for internal test reliability?
Cronbach's alpha...can be used for test in which items have range of responses (not dichotomous)
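As a rough illustration of how Cronbach's alpha is computed, the sketch below uses the standard formula alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the data layout (one row per respondent, one column per item) are assumptions made for this example, not part of the deck.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a table of scores.

    item_scores: list of rows, one row per respondent, one column per item
    (hypothetical layout chosen for this sketch).
    """
    k = len(item_scores[0])                 # number of items
    columns = list(zip(*item_scores))       # scores grouped by item
    sum_item_variances = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Three respondents who answer every item identically: items are
# perfectly consistent, so alpha comes out at its maximum.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

The same variance estimator must be used for items and totals; the ratio is unchanged whether population or sample variance is chosen.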
What measure is used to evaluate the effect of lengthening or shortening a test?
Spearman-Brown correction formula
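A minimal sketch of the Spearman-Brown formula, predicted reliability = n x r / (1 + (n - 1) x r), where r is the current reliability and n is the factor by which test length changes. The function and parameter names are illustrative only.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length.

    length_factor is new length / old length: 2 doubles the test
    (the split-half case), 0.5 halves it.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Split-half use: the correlation between two half-tests is stepped up
# to estimate the reliability of the full-length test.
full_test_estimate = spearman_brown(0.6, 2)
```

With a half-test correlation of .60, the full-length estimate works out to about .75, which also shows why longer tests tend to be more reliable.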
What does the Kuder-Richardson formula refer to?
Assess reliability of a test with dichotomous responses (0 or 1)
What are acceptable scores of reliability?
.80 and above is good reliability.
.70-.79 is acceptable reliability (depending on the purpose of the test)
.60-.69 marginally reliable
.59 and below not reliable
What does split-half reliability refer to?
Internal reliability. Split test items in half and compare consistency of responses using Spearman-Brown.
What is the most common measure of interrater reliability?
Kappa
How does test length affect reliability?
In general, more test items increase reliability
How does range of scores affect reliability?
The greater the range, the better the reliability.
What is Standard Error of Measurement (SEM)?
Estimate of amount of error associated with an obtained score (remember, obtained score = true score + error). In computational form, SEM refers to the standard deviation of the distribution of error scores.
How does SEM relate to reliability?
Inversely related: the larger SEM, the less reliable the test is.
What does "standard deviation of distribution of error scores" (aka. SEM) mean?
Psychometric theory proposes that the "true" score can be estimated as the mean of a distribution of obtained scores. Each obtained score includes the true score and error, and the distribution of these scores falls in the "normal distribution" or "bell curve". If one were to calculate the error scores by subtracting the hypothetical true score from each obtained score, and plot those error scores, they also would fall in a normal distribution. The standard deviation of that distribution of error scores is the "standard deviation of distribution of error scores", or SEM.
What is a confidence interval?
Range of scores around the obtained score that likely include the "true score".
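The cards above do not give a computing formula, but the standard error of measurement is commonly calculated as SEM = SD x sqrt(1 - reliability), and a confidence interval is then built around the obtained score. The sketch below assumes that formula and a normal-curve multiplier of 1.96 for a 95% interval; all names are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r): higher reliability gives a smaller SEM."""
    return sd * math.sqrt(1 - reliability)


def score_confidence_interval(obtained, sd, reliability, z=1.96):
    """Range around an obtained score that likely includes the true score.

    z=1.96 is assumed here for a 95% interval.
    """
    sem = standard_error_of_measurement(sd, reliability)
    return obtained - z * sem, obtained + z * sem


# Example: an IQ-style scale (SD = 15) with reliability .91.
low, high = score_confidence_interval(100, 15, 0.91)
```

For this example the SEM is about 4.5 points, so the 95% interval runs roughly from 91 to 109.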
What is a "true experimental design"?
Experiment in which subjects are randomly assigned to treatment and control groups
What are the benefits of a "true experimental design"?
1. Greater experimental control
2. Greatest protection of internal validity (i.e., support for a "causal" relationship between IV and DV)
What is the name of an experimental design in which subjects are not randomized into a group, but the IV is manipulated (e.g., effects of alcohol on men's and women's reactions to violent films)
Quasi-experimental design...in this example, the level of alcohol is manipulated (IV) and DV (reactions to violent films) is measured. Gender is also an IV of interest, but cannot be manipulated/randomly assigned
What is a "correlational design"?
Variables are not manipulated, so no causal relationship can be assumed. Allows for study of relationships among naturally occurring factors (e.g., relationship of white matter density to performance on cognitive test)
What are 3 study designs to assess developmental changes?
1. Longitudinal
2. Cross-sectional (select subjects of different ages and study them at the same time)
3. Cross-sequential (different age groups are studied over a short period of time). Combines the benefits of cross-sectional and longitudinal designs.
What are the pros and cons of longitudinal research
Pros: subject is his/her own control
Cons: cost, time intensive, attrition, practice-effects
Tends to under-estimate true age-related changes.
What are the pros and cons of cross-sectional research?
Pros: less costly, provides results faster
Cons: cohort effects, differences could be due to experience rather than age. Tends to over-estimate true age-related changes.
What is a Time Series Design?
Dependent variable is measured multiple times, at regular intervals, before and after a treatment is administered
What are the pros and cons of Time Series Design?
Pros: person is own control, controls for maturation, regression to the mean, and testing effects
Cons: history effect is a threat to internal validity
What is Single-Subject Design?
Research involves one subject (or groups of subjects). Baseline measurements are taken and then the intervention is administered. DV is measured several times at baseline, during administration, and after.
What does ABAB stand for?
Reversal, single-subject design. This study design offers multiple baseline (A) and treatment (B) conditions so can measure effects of treatment and treatment withdrawal.
What are some uses of qualitative/descriptive research?
1. Develop theories of relationships among variables.
2. Used for pilot studies to better understand IVs
3. Used with observation, interviews, surveys, case studies
Name the 4 scales of measurement
1. Nominal ("names" of categories, unordered)
2. Ordinal ("order"/rank data; Likert scale)
3. Interval (no absolute "0" so cannot form ratio...IQ)
4. Ratio (has absolute 0...weight, time)
What are parametric statistics?
Statistics analyzing interval and ratio data.
What are the assumptions of parametric statistics?
1. Normal distribution
2. Homogeneity of variance (variance equal among all groups)
3. Independence of observations (one data point is not dependent on another data point)
Note: parametric stats are somewhat robust to violations of normal distribution and homogeneity of variance. They are not robust to violations of independence (i.e., non-independent observations).
What parametric statistic is used to compare two group means?
Student's t-test.
What are the 3 types of t-test?
1. one sample t-test. Compares a sample mean to a known population mean. df = N-1
2. t-test for independent samples. Compares means of 2 independent groups df = N-2.
3. t-test for correlated samples. Compares means of 2 correlated samples (e.g., matched sample tests, pretest/posttest measures) df = N-1 (where N = number of pairs of scores)
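The independent-samples and correlated-samples t-tests listed above can be sketched as follows. This assumes the usual pooled-variance form for independent groups and the difference-score form for paired data; the function names are made up for this example, and the sketch returns only t and df (looking up significance would need a t table or a statistics library).

```python
import math
from statistics import mean, stdev, variance


def independent_t(group1, group2):
    """t for two independent groups (pooled variance); df = N - 2."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2


def correlated_t(first, second):
    """t for paired scores; df = number of pairs - 1."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


t_ind, df_ind = independent_t([1, 2, 3], [4, 5, 6])
t_pair, df_pair = correlated_t([1, 2, 3, 4], [2, 4, 5, 4])
```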
What is degrees of freedom?
The number of independent observations in the sample minus the number of parameters estimated in the formula.
Conceptually, in any sample of N observations, there is a value X (mean of sample) that is known. Relative to that known mean, there are only N-1 raw scores that are "free to vary".
Evans, 1996
When would a researcher use a one-way ANOVA analysis?
One-way ANOVAs are used for one dependent variable and more than 2 groups (a 2 group one-way ANOVA is a t-test)
Explain F ratio in a one-way ANOVA
One-way ANOVAs compare between group variance to within-group variance. The F statistic is a ratio of between group to within group variance.
Explain between group variance and within group variance in ANOVA.
Between group variance is the "desirable" variance in that it reflects the "treatment effect". Within group variance is similar to "noise". It is undesirable variance b/c it decreases the power of the study to detect between group differences.
What is sum of squares? What is mean square?
Sum of squares is a measure of variability in a set of data.
Sum of squares Between represents between group variability. df Between = k-1 (k=number of groups).
Sum of squares Within represents within group variability. df within = N-k (N=total number of subjects)
Mean square = SS divided by df.
How does mean square relate to F ratio?
F = MS between divided by MS within.
What does a significant F ratio indicate?
A significant F ratio means that there is (are) significant difference(s) among group means. It does not state where the significant differences lie.
What is the purpose of post-hoc tests?
Post-hoc comparisons are used to compare pairs of group means to identify where differences lie when a significant F ratio is obtained (e.g., Tukey, Scheffe, Newman-Keuls, Duncan's multiple range test, Fisher's Least Significant Difference Test).
What is the risk associated with post-hoc comparisons?
Increase Type 1 error (the risk of finding a false significant difference)
What is Type I and Type II error?
Type I error (alpha) is the likelihood of finding a significant difference when one does not exist (e.g., alpha = .05 means that, if the null hypothesis is true, there is a 5% likelihood of obtaining a significant result by chance).
Type II error (beta) is the likelihood of not finding a significant difference when one actually does exist.
What is power?
The probability of rejecting a false null. Calculated as 1-beta.
How are type I and type II error related?
They are inversely related (so an increase in risk of Type I error is related to decrease in risk of Type II error)
What is a factorial ANOVA?
Study that allows for 2 or more IVs and one DV
Allows for examination of main effects and interactions
What are main effects and interactions?
Main effects are the direct effects of an IV on the DV. An interaction occurs when the effect of one IV on the DV depends on the value of the other IV. Significant interactions should be interpreted before main effects.
What does moderator refer to?
Moderator is the same thing as interaction. If variable B is said to moderate the relationship between A and C (where A and B are IVs and C is DV), this means that there is a significant interaction between A and B. That is, the relationship between A and C depends on the value of B.
What does mediator refer to?
A mediator accounts for (or partially accounts for) a relationship between an IV and DV. So if A and B are IV and C is DV, B is a mediator if it accounts for (or partially accounts for) the relationship between A and C. In this model, unlike with moderators where there is a significant interaction, the relationship between A and C decreases or is eliminated when B is included in the model. Furthermore, to support mediation, there must be a significant relationship between A and B and B and C.
What is MANOVA?
Multivariate analysis of variance.
Used in studies with more than one DV and one or more IVs
What is the benefit of a MANOVA compared to multiple ANOVAs?
Could conduct multiple ANOVAs for each DV, but this increases the risk of Type I error.
What is ANCOVA or MANCOVA?
Analysis of covariance/multivariate analysis of covariance.
A covariate is a variable that is not randomly distributed between the 2 groups and there is concern that it will contribute to treatment effects. These tests allow for "statistical controlling" of the covariate by including it in the model as an IV. The idea is that if the relationship between IV and DV remains significant when the covariate is in the model, then the covariate does not account for the treatment effect.
What are non-parametric statistics?
Tests for nominal or ordinal data. They are "distribution free" tests and less powerful than parametric tests. For that reason, researchers attempt to use interval and ratio data when possible to increase power.
What is Chi-Square?
Used with nominal data to compare frequencies of observations.
What is the hypothesis with Chi-square?
The null hypothesis is that observed frequencies are randomly distributed. The alternate hypothesis is that the observed frequencies are related to treatment effect (or IV).
What data characteristics are required for Chi-square?
1. independent observations
2. observations can only be classified into one cell
3. can only compare frequencies...not percentages
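To make the frequency comparison concrete, here is a small sketch of the chi-square statistic for a contingency table, using the standard formula: sum of (observed - expected) squared / expected, with expected = row total x column total / N. The function name and table layout are assumptions for this example, and the input must be raw counts, not percentages.

```python
def chi_square(observed):
    """Chi-square statistic and df for a table of observed frequencies.

    observed: list of rows of counts (each observation falls in one cell).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# 2 x 2 example: treatment vs. control by improved vs. not improved.
statistic, df = chi_square([[10, 20], [20, 10]])
```

Here every expected frequency is 15, so the statistic is 4 x (25/15), or about 6.67, on 1 degree of freedom.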
What is the Mann-Whitney U test?
Compares 2 independent groups on a DV that is measured with rank-ordered data.
Comparable to the t-test but used with ordinal data.
Can convert interval and ratio data to ordinal rank and use Mann-Whitney rather than t-test in cases where assumptions of parametric tests are not met.
What is the Wilcoxon Matched Pairs Test?
Compare 2 correlated groups on DV that is measured with rank-ordered data.
Alternative to t-test for correlated samples when data is nonparametric
What is Kruskal-Wallis Test?
Compares 2 or more independent groups on a DV with rank-ordered data.
Alternative to one-way ANOVA with nonparametric data
What are the 2 uses of statistics?
1. Descriptive purposes
2. Inferential purposes
What is descriptive statistics
Numbers created to describe a sample (e.g., measures of central tendency, measures of sample variance). Purpose is to describe a data set.
What is inferential statistics?
Statistics calculated on a sample of a population and used to infer information about the population from the sample (e.g., sample mean, standard deviation). Purpose to make inferences about a population from sample data.
What is the normal distribution?
Normal distribution is the "bell curve". A unimodal distribution in which the mean, median and mode are the same.
What is positively skewed distribution?
Positively skewed distribution is when majority of scores are low (to the left) with a few extreme high scores. Mode<median<mean. This is the distribution of a difficult test where most students did poorly. Also indicates floor effects.
What is negatively skewed distribution?
Majority of scores are high with a few low scores. mean<median<mode. Distribution of scores on an easy test in which most students did well. Also indicates ceiling effects.
What are z-scores?
Standardized scores with mean = 0 and SD = 1. Distribution of z-scores is the same as distribution of raw scores from which the z-scores were calculated.
What are the characteristics of a normal distribution?
68% of scores fall between +-1 SD.
95% of scores between +-2 SD.
99.7% of scores between +-3 SD
What are the mean, median and mode?
Mean is average of scores (sum of scores/number of scores). Highly sensitive to extreme scores.
Mode is most frequent score. Can be more than one mode. Not as easily influenced by extreme scores as mean.
Median is the score at which 50% of scores are below. Not as sensitive to extreme scores as mean.
What is standard deviation
Measure of variability of sample/population. Calculated as square root of variance.
What is variance?
Average of squared differences of each score from the mean.
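The definitions of variance and standard deviation above translate directly into a few lines of code. This sketch follows the card's wording (an average, so it divides by N); note that sample estimates of a population variance conventionally divide by N - 1 instead. The function name is illustrative.

```python
import math


def variance_and_sd(scores):
    """Variance as the average squared deviation from the mean, and its square root."""
    mean = sum(scores) / len(scores)
    variance = sum((x - mean) ** 2 for x in scores) / len(scores)
    return variance, math.sqrt(variance)


variance, sd = variance_and_sd([2, 4, 4, 4, 5, 5, 7, 9])
```

For these eight scores the mean is 5, the variance is 4, and the standard deviation is 2.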
What is the difference between percentage and percentile?
Percentage refers to items on a test. Percentile refers to rank of score given other scores in distribution.
Percentile rank scores have a rectangular distribution in that there is an equal number of each score.
What is the sampling distribution of means?
Distribution of an infinite number of sample means (rather than distribution of raw scores), where each mean is computed from same-sized samples drawn with replacement from the population.
Name two factors related to variability in sampling distribution of sample means.
1. has less variability than population distribution
2. Standard deviation of the distribution is called standard error of the mean
What is the Central Limit Theorem?
1. As sample size increases, shape of sampling distribution of sample means approximates a normal distribution
2. Mean of sampling distribution of sample means = mean of population.
What is a statistic vs. parameter?
Statistic is calculated on sample. Parameter is measure from population.
What is the standard error of the mean?
Standard deviation of the sampling distribution of sample means. Provides a measure of inaccuracy of sample means by estimating the difference between sample and population means.
How do z-scores and percentile rank relate?
.1% is z=-3
2% is z=-2
16% is z=-1
50% is z=0
84% is z=1
98% is z=2
99.9% is z=3
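The z-to-percentile values above are rounded areas under the standard normal curve. A short sketch using Python's standard-library `statistics.NormalDist` reproduces them; the helper name is an assumption for this example.

```python
from statistics import NormalDist


def percentile_rank(z):
    """Percentage of a normal distribution falling below a given z-score."""
    return 100 * NormalDist().cdf(z)


# Percentile ranks for the z-scores listed in the card.
table = {z: percentile_rank(z) for z in (-3, -2, -1, 0, 1, 2, 3)}
```

This relies on scores being normally distributed; for skewed distributions the mapping between z-scores and percentile ranks differs.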
What are the null hypothesis and alternative hypothesis?
Null hypothesis states that there is no relationship between the IV and DV. The alternate hypothesis states that the IV has an effect on DV.
What are the 2 correct findings that can occur in a study (as they relate to null/alternate hypothesis)?
1. True null hypothesis is retained (i.e., the study fails to find a relationship between IV and DV, and there is no relationship in the population).
2. False null hypothesis rejected (i.e., the study finds a relationship between IV and DV, thereby rejecting the null, and there is a relationship in the population).
What are the 2 incorrect findings that can occur in a study (as they relate to null/alternate hypothesis)?
1. True null rejected (i.e., the study finds a relationship between IV and DV, thereby rejecting the null, but in the population there is no relationship). This is type I error (alpha).
2. False null retained (i.e., the study fails to find a relationship between the IV and DV, but there is a relationship in the population). This is Type II error (beta).
What factors increase power?
1. increased sample size
2. increased risk of Type I error
3. one-tailed test
4. greater differences between the population means
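The four factors above can be seen numerically in a simple case. The sketch below assumes a one-sample z-test with known standard deviation, where the effect size is the true mean difference in SD units; this setup and the function name are assumptions chosen to keep the arithmetic transparent, not something specified in the deck.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05, one_tailed=True):
    """Power (1 - beta) of a one-sample z-test for a positive true effect."""
    std_normal = NormalDist()
    tail_alpha = alpha if one_tailed else alpha / 2
    z_critical = std_normal.inv_cdf(1 - tail_alpha)
    shift = effect_size * n ** 0.5          # how far the true mean sits from the null, in SE units
    power = 1 - std_normal.cdf(z_critical - shift)
    if not one_tailed:
        # Small chance of rejecting in the opposite tail.
        power += std_normal.cdf(-z_critical - shift)
    return power


baseline = z_test_power(0.5, 20)
larger_sample = z_test_power(0.5, 50)
larger_alpha = z_test_power(0.5, 20, alpha=0.10)
two_tailed = z_test_power(0.5, 20, one_tailed=False)
larger_effect = z_test_power(0.8, 20)
```

Each comparison against the baseline moves in the direction the card describes: more subjects, a larger alpha, a one-tailed test, and a larger true difference all raise power.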
What are one-tailed and two-tailed hypotheses?
One-tailed hypothesis states the expected direction of the difference between means (e.g., states that one population mean is greater than the other)
Two-tailed hypothesis does not state the direction of differences.
How does a researcher decide whether to accept or reject the null hypothesis?
The statistic is compared to a critical value. If the statistic is lower than the critical value, the null is retained. If the statistic is greater than the critical value, the null is rejected.
What determines the critical value to which a statistic will be compared?
1. A pre-determined alpha level (i.e., level of acceptable risk of type I error)
2. degrees of freedom
What is a correlation coefficient?
A statistic that represents the degree of relationship between two variables. Correlation coefficient states strength and direction of relationship.
The relationship is pictorially represented by a scattergram
What is the range of values for correlation coefficient?
-1 to 1 with 0 = no relationship.
What is Pearson's Product Moment Correlation?
Statistic that represents the relationship between 2 continuous variables.
What factors affect Pearson's Product Moment Correlation?
1. Linearity (assumes linear relationship between 2 variables)
2. Homoscedasticity (variability of scores on one variable is roughly equal across the range of the other variable)
3. Range of scores (wider range of sample scores provides more accurate estimate of relationship)
What is the square of Pearson's R?
Coefficient of determination. Percentage of variability in one measure that is accounted for by variability in the other measure.
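As a worked illustration, Pearson's r can be computed from deviation scores and then squared to give the coefficient of determination. The function name and the data are made up for this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = r ** 2   # proportion of variability shared by the two measures
```

In this example r is about .77 and r squared is .60, so 60% of the variability in one measure is accounted for by the other.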
What statistic measures non-linear relationships between 2 variables?
Eta
What is point-biserial coefficient?
Correlation between one continuous variable with one dichotomous variable.
What is phi coefficient?
Correlation between 2 dichotomous variables.
What is Spearman's rho?
Correlation between 2 rank-ordered variables.
What is a contingency coefficient?
Correlation between 2 nominally scaled variables.
What is eta coefficient?
Correlation between 2 continuous variables whose relationship is non-linear.
What is regression?
Equation set to estimate the value of a criterion (DV) based on values of the predictor(s)(IV). Used when the IV and DV are correlated.
What is multiple regression?
Regression equation that includes more than one predictor and one criterion.
What do X and Y refer to in regression?
X = predictor = IV
Y = criterion = DV
What are the assumptions of regression?
1. Linear relationship between X and Y
2. Homoscedasticity (error scores of criterion, i.e., diff. between estimated and actual Y, are the same across range of X)
3. Homogeneity of variance (variance in criterion is the same across the range of X)
4. Normal distribution of error scores
What is an error score in regression?
Difference between the predicted criterion and the actual criterion. Error scores are assumed to be normally distributed with a mean = 0
What is the assumption of homoscedasticity?
Assumption that there is no relationship between error scores and the actual criterion scores.
What is a regression line and how does it relate to error scores?
Regression line is the line that "best fits" the relationship between the IV and DV. Its location is determined by the least squares criterion.
What is the least squares criterion?
The location of the regression line where there is the least amount of (squared) error in predicting Y scores from X scores.
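The least squares criterion has a closed-form solution for one predictor: slope = sum of cross-products / sum of squares of X, and intercept = mean of Y - slope x mean of X. The sketch below uses those standard formulas; function names and data are illustrative.

```python
def least_squares_line(x, y):
    """Slope and intercept of the line minimizing squared prediction error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def error_scores(x, y, slope, intercept):
    """Actual criterion minus predicted criterion for each case."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


x = [1, 2, 3, 4, 5]          # predictor (IV)
y = [2, 4, 5, 4, 5]          # criterion (DV)
slope, intercept = least_squares_line(x, y)
errors = error_scores(x, y, slope, intercept)
```

For these data the line is Y = 2.2 + 0.6X, and the error scores sum to zero (up to rounding), consistent with the assumption that error scores have a mean of 0.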
What is the multiple correlation coefficient?
Relationship between 2 or more IVs with one DV. It is the predictive power of a multiple regression equation.
What factors influence multiple regression coefficient?
Highest when the predictors each have a high correlation with the criterion but low correlations with each other.
What is multicollinearity?
The degree to which predictors correlate with one another. Decreases the accuracy of the regression equation and does not add predictive power to the equation.
How does the multiple correlation coefficient relate to the simple correlation coefficients between the predictors and criterion?
Multiple R is never lower than the highest simple correlation between a predictor and the criterion. However, because of multicollinearity, adding predictors that correlate highly with existing predictors adds little predictive power, and adjusted R squared can decrease.
What is multiple R squared?
Coefficient of multiple determination (proportion of variance in criterion accounted for by the predictors)
What is step-wise regression?
Predictors are added in a step-wise fashion. Goal is to identify the smallest number of predictors that account for the greatest amount of variability in the criterion.
What is canonical correlation?
Calculates the relationship between 2 or more predictors and 2 or more criteria.
What is partial correlation?
Used to assess the relationship between a predictor and criterion with the effects of another predictor "partialled out".
What is zero-order correlation?
Correlation between 2 variables without accounting for the effects of any other variables.
What is a suppressor variable in regression/correlation?
When a variable suppresses a relationship between 2 variables.
What is Discriminant Function Analysis?
Used to classify individuals into groups based on their scores on multiple predictors.
What is a multiple cutoff score?
Setting a minimal cutoff score on a series of predictors such that an individual will not be included in a group if even one of his/her predictor scores is below the cutoff.
What is Structural Equation Modeling?
SEM refers to techniques used to estimate causal relationships between multiple variables. These strategies calculate pairwise correlations between multiple variables, and the alternate hypothesis posits a causal relationship among multiple variables (e.g., LISREL, path analysis)
What are the assumptions in SEM?
1. Linear relationship among variables
2. Path analysis assumes one-way causal flow and includes observed variables only
3. LISREL allows for one or two-way relationships and is used when the model specifies observed and latent variables.
What is trend analysis?
Statistical techniques used to estimate a trend of change (e.g., linear, quadratic, cubic, quartic) in the DV (rather than testing the assumption that the DV does not change vs. does change).
What is logistic regression?
Used for dichotomous dependent variables. Similar to discriminant function analysis in that it determines the likelihood of an individual belonging to a group.
What are the differences between discriminant function analysis and logistic regression?
Discriminant Function Analysis: 1.assumes multivariate normal distribution, 2.homogeneity of variance/covariance 3. uses continuous variables only
Logistic regression: can use nominal or continuous variables
What is autocorrelation?
Correlation between 2 observations at different time lags. Used in time-series analyses.
What is Bayes' theorem?
Formula for obtaining conditional probabilities. Used to revise conditional probabilities based on new information.
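A minimal sketch of Bayes' theorem, P(A|B) = P(B|A) x P(A) / P(B), applied to a screening-test scenario. The scenario, numbers, and parameter names are hypothetical and chosen only to show how a prior probability is revised by new information.

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) from the prior and the test's hit and false-alarm rates."""
    # Total probability of a positive result, with or without the condition.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# Hypothetical: base rate 1%, test detects 90% of true cases,
# and flags 5% of people without the condition.
posterior = bayes_posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
```

Even with a fairly accurate test, the revised probability here is only about 15%, because the condition is rare to begin with.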
How is the standard error of the mean related to confidence interval?
CI = sample mean +/- (confidence factor x SEM). Multiply the SEM by the confidence factor you have selected and add/subtract the result to/from the sample mean. This gives the boundaries for the CI.
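The calculation can be sketched as below, where SEM means the standard error of the mean (sample SD / sqrt(N)), not the standard error of measurement. The default confidence factor of 1.96 assumes a 95% interval based on the normal curve; with small samples a t-based factor is normally used instead. Names are illustrative.

```python
import math
from statistics import mean, stdev


def mean_confidence_interval(sample, confidence_factor=1.96):
    """Lower and upper CI bounds: sample mean +/- confidence factor x SEM."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    margin = confidence_factor * standard_error
    return mean(sample) - margin, mean(sample) + margin


lower, upper = mean_confidence_interval([2, 4, 4, 4, 5, 5, 7, 9])
```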
What is the equation for a z-score and T-score?
z = (x - mean of x)/standard deviation
T = 10z + 50
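The two equations chain together naturally: a raw score is first standardized, then rescaled. A brief sketch, with hypothetical function names:

```python
def z_score(x, mean, sd):
    """Standardize a raw score: distance from the mean in SD units."""
    return (x - mean) / sd


def t_score(z):
    """Rescale a z-score to the T metric (mean 50, SD 10)."""
    return 10 * z + 50


# A raw score of 115 on a scale with mean 100 and SD 15.
z = z_score(115, 100, 15)
t = t_score(z)
```

A score one standard deviation above the mean gives z = 1 and T = 60.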
What is the equation for F, MS, and SS?
F = MSbetween/MSwithin
MSbetween = SSbetween/dfbetween
MSwithin = SSwithin/dfwithin
SSbetween = sum[n(group mean - grand mean)squared], where n = number of subjects in the group
SSwithin = sum[sum(x - group mean)squared]
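These formulas can be followed step by step in code. The sketch below computes SS, df, MS, and F for a one-way ANOVA from raw scores; the function name and the returned dictionary are conveniences invented for this example, and judging significance would still require comparing F to a critical value.

```python
from statistics import mean


def one_way_anova(groups):
    """SS, df, MS, and F for a list of groups of raw scores."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]

    # Between: each group's mean vs. the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within: each score vs. its own group's mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1                 # k - 1
    df_within = len(all_scores) - len(groups)    # N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

For these three groups, SSbetween = 6 on 2 df and SSwithin = 6 on 6 df, so MSbetween = 3, MSwithin = 1, and F = 3.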
What is Effect Size?
Magnitude of the difference among groups. ES = (mean1-mean2)/pooled standard deviation

Effect size is a numerical representation of differences between 2 groups that uses a common metric. This metric can be calculated for different studies, allowing for a comparison of group differences among numerous studies. Frequently used in meta-analyses.
What are the values of small, medium and large effect size?
.2, .5, .8
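The effect size formula given above (difference between means divided by the pooled standard deviation) is often called Cohen's d. A minimal sketch, assuming the usual df-weighted pooling of the two sample variances; the function name is illustrative.

```python
import math
from statistics import mean, variance


def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    )
    return (mean(group1) - mean(group2)) / pooled_sd


d = cohens_d([4, 5, 6], [1, 2, 3])
```

Because d is expressed in standard deviation units, values from studies that used different measures can be compared against the same small/medium/large benchmarks.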
What is heteroscedasticity?
Heteroscedasticity refers to when the standard error of scores varies through the range of scores (e.g., more variability on low scores than high scores in a distribution).
What is factor analysis?
FA is a statistical procedure that is used to assess relationships among test items. It allows for item reduction and identification of underlying constructs.
What is confirmatory factor analysis?
A test of hypothesized relationships among items (i.e., the researcher hypothesizes a factor structure a-priori and confirmatory factor analysis tests this hypothesis)
What is exploratory factor analysis?
When no relationships are proposed among items the FA correlates all items with each other and identifies relationships among the items.
What are principal components?
Linear combinations of variables that account for variance among items. Typically the researcher selects components with eigenvalues above 1.
What is orthogonal rotation?
Rotation of variance matrix such that components/factors are not correlated.
What is oblique rotation?
Rotation of variance that allows factors to be correlated.
