Reliability, Validity, Levels of Measurement, Scales

How We Figure Out What to Measure

Conceptualization
- The process of taking a construct and refining it by giving it a conceptual or theoretical definition
- Example: research focusing on college students. In Ohio? What region? Age? Major?

Operationalization
- Links a conceptual definition to a specific set of measurement techniques

Coming up with a measure

- Remember the conceptual definition
- Keep an open mind
- Borrow from others
- Anticipate difficulties
- Don't forget units of analysis

Empirical Hypothesis
- The degree of association: how well the operationalized variables are associated (or not) with the concept or construct determines the hypothesis

Reliability
- Reliability means dependability or consistency
- The same thing occurs over and over under the same conditions (a scale, for example)
- How dependable is the study?
- Is the study consistent, or does it yield widely varying results?
- Can the study be replicated?
- Measurement directly affects the quality of conclusions. Care is needed to make sure that results are not corrupted by improper measurement.

- The operational definition of a concept should have a precise meaning: the terms by which you measure a concept should be explicit.
- Problems with reliability and validity are the biggest threats to proper measurement.
- Reliability is the extent to which an experiment, test, or any measuring procedure yields the same results on repeated trials.

- Do you get the same result every time?

Three tests of reliability:
- Test-retest method: applying the same test to the same observations after a period of time and then comparing the results of the different measurements
- Alternative form method: two different measures of the same concept administered to the same respondents at different times before the scores are compared
- Split-halves method: divide a multi-item measure into two measures, with both of the new measures applied at the same time

Improving Reliability
- Clearly conceptualize constructs
- Increase the level of measurement
- Use multiple indicators of a variable
- Triangulation
- Use pretests, pilot studies, and replication

Validity
- A valid measure is one that measures what it is supposed to measure; in other words, validity is the degree of correspondence between the measure and the concept it is thought to measure.

Four tests of validity

Validity
- Truthfulness
- Refers to the match between a construct and a measure
- We want a measure to be valid for a particular purpose and definition
- How good is the measure?
- Is the data measured correctly?
- Is the data analyzed correctly (statistical)?

Internal Validity
- Are there errors as a result of the internal design of the study?
- Are there errors as a result of the controls?
- Internal validity problems can arise from a flawed survey, along with a multitude of other factors

External Validity
- Can your experiment's findings be generalized?
- External validity questions arise in every study; however, methods exist to keep external validity high and the number of external flaws low

Types of Validity
- Face validity: a judgment that the indicator really measures the construct
- Content validity: does your measure represent the full content of a definition?
- Criterion validity: uses some standard or criterion to indicate a construct accurately
  - Concurrent validity: the indicator must be associated with a preexisting indicator judged to be valid
  - Predictive validity: the indicator predicts future events that are logically related to a construct

1) If I create a new test of mathematical ability for high school students and test it by having high school math teachers look at it and tell me if it seems appropriate, I am measuring for ____________________ validity.

2) If I am examining an individual's ability to cope with stress and have three attributes I am particularly interested in, and I am checking to see if my construct hits on all three attributes, I am measuring for ____________________ validity.

3) If I create a new test for cognitive recognition and students who score high on it also score high on previously existing tests for cognitive recognition, I have demonstrated ____________________ validity.

4) If I compare my measure for testing the potential to suffer from childhood diabetes with a previously used test, I am looking for ____________________ validity.

5) If I create a new test of intelligence and students who score high on it also do better in college than those who score low, I have shown ____________________ validity.

Validity
- Tests of validity are not as conclusive as tests of reliability.
- Reliability is relatively easy to demonstrate through some form of repeated trials.
- Validity is more difficult to demonstrate because we can never be sure about the true value of a concept. This is especially true of abstract concepts.
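As a rough illustration of why repeated trials demonstrate reliability but not validity, here is a minimal Python sketch. It simulates a bathroom scale with a constant bias; the function names, the bias, and the noise level are all assumptions made up for this example. The true weights are known here only because the data are simulated, which is exactly what real research lacks.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def weigh(true_weights, bias, noise_sd, rng):
    """Simulate one weighing session on a hypothetical scale with a constant bias."""
    return [w + bias + rng.gauss(0, noise_sd) for w in true_weights]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
true_weights = [rng.uniform(50, 100) for _ in range(200)]  # known only because simulated

# Same scale, same people, two occasions: repeated trials.
first = weigh(true_weights, bias=5.0, noise_sd=0.5, rng=rng)
second = weigh(true_weights, bias=5.0, noise_sd=0.5, rng=rng)

# Consistency across the two trials (reliability).
test_retest_r = pearson_r(first, second)
# Average distance from the true value (a validity problem if far from zero).
mean_error = statistics.fmean(m - t for m, t in zip(first, true_weights))

print(f"test-retest r = {test_retest_r:.3f}")
print(f"mean error    = {mean_error:.2f}")
```

The two sessions agree almost perfectly with each other, yet every reading is off by roughly the built-in bias. The correlation can be computed from observed data alone; the mean error cannot, because it requires the true values.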

- Whereas a valid measure is reliable (because if truly valid, it will measure the concept correctly every time), a reliable measure is not necessarily valid: it could be measuring the concept incorrectly in a consistent way.

Relationship between reliability and validity

Levels of Measurement

The level of measurement of a variable describes:
- The amount of precision associated with a variable
- The mathematical properties of the variable
Both precision and mathematical properties increase as you move up the levels of measurement from nominal to ratio.

Continuous vs. discrete variables
- Continuous: have an infinite number of values or attributes that flow along a continuum (temperature, age, income, crime rate)
- Discrete: have a relatively fixed set of separate values or attributes (gender, religion, marital status)

Nominal Level
- Only reports a difference
- Candidate preference, religious preference, Yes/No, etc.
- Discrete variables

Ordinal Level
- Rank ordered
- Grades, opinion (Strongly Agree, Agree, Disagree, Strongly Disagree)
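A minimal Python sketch of what "rank ordered" means in practice for the agreement categories above. The numeric codes and the sample responses are assumptions invented for illustration; only the order of the codes carries meaning, not the distance between them.

```python
import statistics

# Response categories listed in rank order. The codes 1..4 are arbitrary ranks
# (an assumption of this sketch), not measured quantities.
CATEGORIES = ["Strongly Agree", "Agree", "Disagree", "Strongly Disagree"]
RANK = {label: position for position, label in enumerate(CATEGORIES, start=1)}

# Made-up responses from five respondents.
responses = ["Agree", "Strongly Agree", "Disagree", "Agree", "Strongly Disagree"]

ranks = sorted(RANK[r] for r in responses)
median_rank = statistics.median_low(ranks)   # always lands on an actual category
median_label = CATEGORIES[median_rank - 1]
modal_label = statistics.mode(responses)     # the mode would also work for nominal data

print(median_label, modal_label)
```

The median and the mode are safe summaries because they rely only on order or on counts. An arithmetic mean of the codes would treat the gap between Agree and Disagree as equal to the gap between Disagree and Strongly Disagree, which ordinal data does not guarantee.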

- At the ordinal level, categories may be ranked in order in addition to indicating a difference between categories.
- Example: Please indicate the highest level of education you reached (elementary, high school, college, more).
- Precision: a little more precision than the nominal level; can be used with more statistical tools

Interval/Ratio Level
- A specified distance between values
- Interval does not contain a true zero point (ratio does)
- Interval: IQ, SAT
- Ratio: years of school, income
- The interval level includes all of the information of the preceding levels and adds meaningful intervals between values of the variable, but does not have a meaningful zero.
  Example: What did you score on the SAT?
  Precision: more precision; can be used with most statistical tools
- The ratio level adds a meaningful zero to the interval level.
  Example: How many years of education?
  Precision: the most precision; can be used with most statistical tools

Scales
- Some concepts can be captured with a single question.
- More complex concepts may require a multi-item measure consisting of several questions that capture different components of the concept and increase validity.

- Summation index: combines the scores on multiple questions to create one single measure of a concept
- Likert scale: uses only select questions from an index, those that differentiate between respondents, to create a single score for each respondent
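The summation index can be sketched in a few lines of Python. The item names, the 1-to-4 scoring, and the respondent data are hypothetical, and returning no score when an item is unanswered is one possible choice rather than a rule from the slides.

```python
# Hypothetical agreement items scored 1 (Strongly Disagree) to 4 (Strongly Agree).
ITEMS = ["q1", "q2", "q3"]

responses = {
    "r1": {"q1": 4, "q2": 3, "q3": 4},
    "r2": {"q1": 2, "q2": 1, "q3": 2},
    "r3": {"q1": 3, "q2": None, "q3": 4},  # q2 left unanswered
}


def summation_index(answers, items=ITEMS):
    """Add up the item scores; return None if any item is missing rather than guess."""
    scores = [answers.get(item) for item in items]
    if any(score is None for score in scores):
        return None
    return sum(scores)


index_scores = {rid: summation_index(ans) for rid, ans in responses.items()}
print(index_scores)
```

Each respondent ends up with one number that stands in for the concept, which is what makes a multi-item measure usable as a single variable.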

- Guttman scale: has answer choices arranged in an ordinal manner; respondents who agree with a higher-ranked answer will also agree with each of the lower-ranked answers
- Factor analysis: allows researchers to uncover patterns across related measures to create summary variables that represent different dimensions of the same concept
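The Guttman property can be checked mechanically. The sketch below is a minimal illustration in Python; the item names and response patterns are invented, and real scale construction involves more than this single check.

```python
# Items ordered from lowest-ranked (easiest to agree with) to highest-ranked.
# 1 = agrees, 0 = does not agree. Names and data are made up for illustration.
ITEM_ORDER = ["item_low", "item_mid", "item_high"]


def fits_guttman_pattern(answers):
    """Return True if every agreement comes with agreement on all lower-ranked items."""
    seen_disagreement = False
    for item in ITEM_ORDER:
        if answers[item] == 0:
            seen_disagreement = True
        elif seen_disagreement:
            # Agreed with a higher-ranked item after rejecting a lower-ranked one.
            return False
    return True


consistent = {"item_low": 1, "item_mid": 1, "item_high": 0}
inconsistent = {"item_low": 0, "item_mid": 1, "item_high": 1}

print(fits_guttman_pattern(consistent), fits_guttman_pattern(inconsistent))
```

A respondent who fits the pattern can be summarized by the highest-ranked item they agree with, since agreement on everything below it is implied.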

Mutually Exclusive
- One and only one: each case may fit the criteria of only one category
- Example: Religion: Christian, non-Christian, Jewish, Buddhist (NOT MUTUALLY EXCLUSIVE)

Exhaustive
- Every case fits into some category
- Example: If the election were today, would you support Sherrod Brown, the Democrat, or Mike DeWine, the Republican, or neither? (NOT EXHAUSTIVE)

Missing Data
- No survey is perfect; certain questions will be left unanswered or skipped entirely
- Remedy? A catch-all category and/or a way to factor out missing data
- Yet missing data can still mislead a study
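A minimal Python sketch of the two remedies for missing data, applied to the vote-choice question. The catch-all label, the function name, and the answers are hypothetical. Storing one answer per respondent keeps the categories mutually exclusive by construction, and the added catch-all is what makes them exhaustive.

```python
from collections import Counter

# Answer options plus an added catch-all (an assumption of this sketch)
# so that every respondent fits into some category.
CATEGORIES = {"Brown", "DeWine", "Neither", "Undecided/Other"}

# Made-up answers; None marks a skipped question.
raw_answers = ["Brown", "DeWine", None, "Neither", "Brown", "Undecided/Other", None]


def tabulate(answers, categories=CATEGORIES):
    """Count valid answers per category and report the share that is missing."""
    unknown = [a for a in answers if a is not None and a not in categories]
    if unknown:
        raise ValueError(f"answers outside the coding scheme: {unknown}")
    valid = [a for a in answers if a is not None]
    missing_share = (len(answers) - len(valid)) / len(answers)
    return Counter(valid), missing_share


counts, missing_share = tabulate(raw_answers)
valid_total = sum(counts.values())
brown_share_of_valid = counts["Brown"] / valid_total  # missing data factored out

print(counts, round(missing_share, 2), brown_share_of_valid)
```

Reporting the missing share next to the percentages matters: factoring out nonresponse changes the base, so a share of valid answers can look very different from a share of everyone surveyed.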

One Last Thing
- Feeling thermometers
- Likert scales
- Response set problem: how do we fix this?