Glossary of Statistics Terminology
- are numerical & can be ordered or ranked. (They all have a certain quantity – age, weight, etc.)
- variables that can be placed into distinct categories according to some character or attribute. (They all have a certain quality – all male)
- ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero (e.g., temperature in °C or °F, where 0° does not mean an absence of temperature)
- possesses all characteristics of interval measurement, and there exists a true zero. In addition, true ratio exists when the same variable is measured on two different members of the population.
- classifies data into mutually exclusive (non-overlapping), exhaustive categories in which no order or ranking can be imposed on the data. (No category is more important than another, e.g., names of different characters.)
- Classifies data into categories that can be ranked; however, precise differences between ranks do not exist. (Small, med, large drinks don’t have to be the same size.)
- Divide the population into groups according to some important characteristics then sample each group randomly.
- Randomly assign numbers to subjects and then choose every “nth” subject.
- Start with intact groups to represent the population & then randomly select a few of these groups where all subjects that are members of the selected groups will be involved in the study.
- Selected by randomly assigning numbers to subjects & using chance or random methods to choose subjects.
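The four sampling methods described above can be sketched in a few lines of Python. This is only an illustration: the 20 subject IDs, the odd/even strata, the clusters of five, and the sample sizes are all made-up assumptions, not part of the deck.

```python
import random

rng = random.Random(0)          # fixed seed so the sketch is repeatable
subjects = list(range(1, 21))   # 20 hypothetical subject IDs

# Random sample: every subject has the same chance of being chosen.
random_sample = rng.sample(subjects, 5)

# Systematic sample: pick a random starting point, then every nth subject.
n = 4
start = rng.randrange(n)
systematic_sample = subjects[start::n]

# Stratified sample: split into groups by a characteristic (here odd/even ID,
# a stand-in for a real attribute), then sample randomly within each group.
strata = {
    "odd": [s for s in subjects if s % 2 == 1],
    "even": [s for s in subjects if s % 2 == 0],
}
stratified_sample = [s for group in strata.values() for s in rng.sample(group, 2)]

# Cluster sample: start with intact groups, randomly pick a few groups,
# and include every member of the chosen groups.
clusters = [subjects[i:i + 5] for i in range(0, len(subjects), 5)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [s for cluster in chosen_clusters for s in cluster]
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, while cluster sampling takes every subject from a few groups.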
- collection, organization, summarization, & presentation of data
- making generalizations from samples to populations, performing hypothesis testing, determining relationships among variables, & making predictions.
- a measure obtained by using the data values of a sample.
- A measure obtained by using all the data values for a specific population.
- the sum of the values, divided by the total number of values. The symbol X̄ (X-bar) represents the sample mean.
- is the midpoint of the data array. The symbol for median is MD.
- the value that occurs most often in a data set. No symbol.
- the highest value minus the lowest value. The symbol R is used for range.
- the average of the squares of the distance each value is from the mean.
- Standard Deviation
- the square root of the variance.
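As a quick check of the measures defined above, here is a small sketch using Python's standard `statistics` module. The data set is made up for illustration. The "average of the squares" definition of variance corresponds to the population formula (`pvariance`); for a sample, the sum of squares is usually divided by n - 1 instead (`variance`).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data values

mean = statistics.mean(data)            # sum of values / number of values -> 5
median = statistics.median(data)        # midpoint of the sorted data -> 4.5
mode = statistics.mode(data)            # most frequent value -> 4
value_range = max(data) - min(data)     # highest minus lowest -> 7

# Population variance: average of the squared distances from the mean.
pop_variance = statistics.pvariance(data)   # 32 / 8 -> 4
# Standard deviation: square root of the variance.
pop_std = statistics.pstdev(data)           # -> 2.0

# Sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(data)  # 32 / 7
```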
- Symmetrical (normal curve)
- it is symmetric because it is bell-shaped, with the highest point in the center. It is evenly distributed about the mean.
- Characteristics of the normal curve
- 1. The normal distribution curve is bell-shaped.
2. The mean, median, & mode are equal and located at the center of the distribution.
3. The normal distribution curve is unimodal (it has only one mode).
4. The curve is symmetric about the mean; the shape is the same on both sides of the center.
5. The curve is continuous; that is, there are no gaps or holes.
6. The curve never touches the x-axis. Theoretically, it only gets increasingly closer.
7. The total area under the normal distribution curve is equal to 1, or 100%.
8. The area under the part of the normal curve that lies within 1 standard deviation of the mean is approximately 68%, within 2 standard deviations about 95%, and within 3 standard deviations about 99.7%.
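The 68% / 95% / 99.7% figures in characteristic 8 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # standard normal curve

def area_within(k):
    """Area under the curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The exact areas are slightly different from the rounded percentages (for example, roughly 0.9545 rather than 0.95 for 2 standard deviations), which is why the rule is stated as an approximation.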
- Probability Rules
- 1.) The probability of any event E is a number (either a fraction or decimal) between and including 0 and 1. Denoted by 0 ≤ P(E) ≤ 1.
2.) If an event E cannot occur (i.e., the event contains no members in the sample space), its probability is 0. (Ex: if you roll a die, there is a 0 probability you'll get a 9.)
3.) If an event E is certain, then the probability of E is 1.
4.) The sum of the probabilities of the outcomes in the sample space is 1.
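The four rules can be illustrated with the die example. This sketch assumes a fair six-sided die, so every outcome is equally likely and a probability is simply (favorable outcomes) / (total outcomes); the function name `probability` is made up for the example.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]   # outcomes of rolling one fair die

def probability(event):
    """P(E) for equally likely outcomes: favorable / total."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

p_even = probability({2, 4, 6})              # Rule 1: between 0 and 1 -> 1/2
p_nine = probability({9})                    # Rule 2: impossible event -> 0
p_any = probability(set(sample_space))       # Rule 3: certain event -> 1
total = sum(probability({o}) for o in sample_space)   # Rule 4: sums to 1
```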
- Probability Experiment
- is a chance process that leads to well-defined results called outcomes.
- is the set of outcomes of a probability experiment.
- The result of a single trial.
- is an arrangement of n objects in a specific order.
- are used when the order or arrangement is not important as in the selecting process. (Ex - pick a committee of 5 students)
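The difference between the two ideas above (order matters vs. order does not matter) shows up clearly when both are counted for the same selection. The sketch uses `math.perm` and `math.comb` (Python 3.8 or later) and four made-up student labels.

```python
import math
from itertools import combinations, permutations

students = ["A", "B", "C", "D"]   # hypothetical students

# Order matters: (A, B) and (B, A) are different arrangements.
ordered = list(permutations(students, 2))
# Order does not matter: {A, B} is counted once.
unordered = list(combinations(students, 2))

n_permutations = math.perm(len(students), 2)   # 4 * 3 -> 12
n_combinations = math.comb(len(students), 2)   # 12 / 2! -> 6
```

Each 2-student combination can be arranged in 2! = 2 orders, which is why there are twice as many permutations as combinations here.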
- Counting Rule
- In a sequence of n events in which the first one has k1 possibilities, the second event has k2, the third has k3, and so forth, the total number of possibilities of the sequence will be
k1 ∙ k2 ∙ k3 ∙ … ∙ kn
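A small sketch of the counting rule, using a made-up example of choosing an outfit (3 shirts, 2 pants, 4 shoes). The product of the possibilities is compared against an explicit listing of every sequence.

```python
import math
from itertools import product

# Hypothetical choices for a three-event sequence.
shirts = ["red", "blue", "green"]                      # k1 = 3
pants = ["jeans", "khakis"]                            # k2 = 2
shoes = ["boots", "sneakers", "sandals", "loafers"]    # k3 = 4

# Counting rule: multiply the number of possibilities for each event.
total_by_rule = math.prod([len(shirts), len(pants), len(shoes)])   # 3 * 2 * 4 -> 24

# Listing every possible sequence gives the same count.
outfits = list(product(shirts, pants, shoes))
```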
- Hypothesis Testing
- a decision-making process for evaluating claims about a population.
- How to set up null and alternates
- Null (H0): there is no difference. Alternate (H1): there is a difference.
H0: μ = 25
H1: μ ≠ 25
You can also use >, <, ≤, ≥; the latter two only go on the null! When the problem states "less than," the < goes with the alternate. Also put "(claim)" by whichever hypothesis we want to find true.
- Type I Error
- rejecting the null hypothesis when it is actually true: there is no change in the population, but the sample shows a change.
- Critical Region
- is the range of values of the test value that indicate that there is a significant difference and that the null should be rejected.
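To make the critical region concrete, here is a sketch of a two-tailed z test of H0: μ = 25 against H1: μ ≠ 25. The choice of a z test (which assumes the population standard deviation is known) and all of the numbers below (sample mean, σ, n, α) are assumptions made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 25            # value stated in the null hypothesis
sample_mean = 26.5   # made-up sample result
sigma = 4            # assumed known population standard deviation
n = 36               # made-up sample size
alpha = 0.05         # significance level: probability of a type I error

# Test value: how many standard errors the sample mean is from mu_0.
test_value = (sample_mean - mu_0) / (sigma / sqrt(n))   # 1.5 / (4/6) -> 2.25

# Two-tailed test: the critical region is everything beyond +/- critical_value.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96

# Reject the null if the test value falls in the critical region.
reject_null = abs(test_value) > critical_value
print(f"test value = {test_value:.2f}, critical values = ±{critical_value:.2f}")
print("reject H0" if reject_null else "do not reject H0")
```

With these numbers the test value of 2.25 lies beyond 1.96, so it falls in the critical region and the null would be rejected.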
- A statistical method used to determine whether a relationship between variables exists.
- A statistical method used to describe the nature of the relationship between variables.
- y = a + bx: what do a and b represent?
- a is the y-intercept and b is the slope
- Correlation Coefficient
- computed from the sample data, it measures the strength and direction of a linear relationship between two variables. The symbol for the sample is r; for the population it is ρ.
- Positive Correlation/relationship
- both variables increase together or decrease together
- Negative correlation/relationship
- as one variable increases, the other decreases.
- Dependent Variable
- The resultant variable.
- Independent variable
- controlled or manipulated.
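The regression line y = a + bx and the correlation coefficient r can both be computed from the same sums of squares. The sketch below does this by hand on a made-up data set, with x as the independent variable and y as the dependent variable; the variable names are chosen for this example only.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]   # independent variable (made-up values)
y = [2, 4, 5, 4, 5]   # dependent variable (made-up values)

mean_x = sum(x) / len(x)   # 3.0
mean_y = sum(y) / len(y)   # 4.0

# Sums of squared deviations and cross-products.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))   # 6.0
s_xx = sum((xi - mean_x) ** 2 for xi in x)                          # 10.0
s_yy = sum((yi - mean_y) ** 2 for yi in y)                          # 6.0

b = s_xy / s_xx               # slope -> 0.6
a = mean_y - b * mean_x       # y-intercept -> 2.2
r = s_xy / sqrt(s_xx * s_yy)  # correlation coefficient, about 0.77

print(f"y = {a:.1f} + {b:.1f}x, r = {r:.2f}")
```

Here r is positive, so the relationship is positive: larger x values tend to go with larger y values, and the slope b has the same sign as r.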