VALIDITY AND RELIABILITY PAPER


CHAPTER ONE

VALIDITY

Validity refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure. While reliability is concerned with the accuracy of the actual measuring instrument or procedure, validity is concerned with the study's success at measuring what the researchers set out to measure.

Social science research differs from research in fields such as physics and chemistry for many reasons. One reason is that the things social science research is trying to measure are intangible, such as attitudes, behaviors, emotions, and personalities. Whereas in physics you can use a ruler to measure distance, and in chemistry you can use a graduated cylinder to measure volume, in social science research you cannot pour emotions into a graduated cylinder or use a ruler to measure how big someone's attitude is (no puns intended).

As a result, social scientists have developed their own means of measuring such concepts as attitudes, behaviors, emotions, and personalities. Some of these techniques include surveys, interviews, assessments, ink blots, drawings, dream interpretations, and many more. A difficulty in using any method to measure a phenomenon of social science is that you never know for certain whether you are measuring what you want to measure.

Validity is the element of social science research that addresses whether the researcher is actually measuring what s/he claims to be measuring. As an example, let us pretend we want to measure attitude. A psychologist by the name of Kurt Goldstein developed a way to measure "abstract attitude" by assessing several different abilities in brain injury patients, such as the ability to separate their internal experience from the external world, the ability to shift from one task to another, and the ability to recognize an organized whole, to break it into component parts, and then reorganize it as before. Carl Jung defined attitude as introversion and extraversion. Raymond Cattell defined attitude in three components: intensity of interest, interest in an action, and interest in action toward an object (Hall & Lindzey, 1978). With definitions this different, an instrument that is valid for one conception of attitude may not be valid for another.

Researchers should be concerned with both external and internal validity. External validity refers to the extent to which the results of a study are generalizable or transferable. (Most discussions of external validity focus solely on generalizability; see Campbell and Stanley, 1966. We include a reference here to transferability because many qualitative research studies are not designed to be generalized.)

Internal validity refers to (1) the rigor with which the study was conducted (e.g., the study's design, the care taken to conduct measurements, and decisions concerning what was and wasn't measured) and (2) the extent to which the designers of a study have taken into account alternative explanations for any causal relationships they explore (Huitt, 1998). In studies that do not explore causal relationships, only the first of these definitions should be considered when assessing internal validity.

Scholars discuss several types of internal validity, including the following:

* Face Validity
* Criterion-Related Validity
* Construct Validity
* Content Validity


KINDS OF VALIDITY

Face validity requires that your measure appears relevant to your construct to an innocent bystander, or more specifically, to those you wish to measure. Face validity can be established by your Mom - just ask her if she thinks your survey could adequately and completely assess someone's attitude. If Mom says yes, then you have face validity. However, you may want to take this one step further and ask individuals similar to those you wish to study if they feel the same way your Mom does about your survey. The reason for asking these people is that people can sometimes become resentful and uncooperative if they think they are being misrepresented to others, or worse, if they think you are misrepresenting yourself to them. For instance, if you tell people you are measuring their attitudes, but your survey asks them how much money they spend on alcohol, they may think you have lied to them about your study. Or if your survey only asks how they feel about negative things (e.g., if their car was stolen, if they were beaten up, etc.), they may think that you are going to find that these people all have negative attitudes, when that may not be true. So, it is important to establish face validity with your population of interest.

In order to have a valid measure of a social construct, one should never stop at achieving only face validity, as this is not sufficient. However, one should never skip establishing face validity either, because without it one cannot achieve the other components of validity.

Content validity is very similar to face validity, except that instead of asking your Mom or members of your population of interest, you must ask experts in the field (unless your Mom is an expert on attitude). The theory behind content validity, as opposed to face validity, is that experts are aware of rare or elusive nuances in the construct of which the layperson may not be aware. For example, if you submitted your attitude survey to Kurt Goldstein for a content validity check, he might say you need something to assess whether your respondents can break something down into component parts and then resynthesize it, as this is an important aspect of attitude, and that otherwise you have no content validity. Many studies proceed once content validity has been achieved; however, this does not necessarily mean the measures used are entirely valid.

Criterion validity is a more rigorous test than face or content validity. Criterion validity means your attitude assessment can predict or agree with constructs external to attitude.

Two types of criterion validity exist:

Predictive validity: Can your attitude survey predict? For example, if someone scores high, indicating a positive attitude, are high attitude scores also predictive of job promotion? If you administer your attitude survey to someone and s/he rates high, indicating a positive attitude, and then later that week s/he is fired from his/her job and his/her spouse divorces him/her, you may not have predictive validity.

Concurrent validity: Does your attitude survey give scores that agree with other things that go along with attitude? For example, if someone scores low, indicating a negative attitude, are low attitude scores concurrent with (happening at the same time as) negative remarks from that person? High blood pressure? If you administer your attitude survey to someone who is cheerful and smiling a lot, but s/he rates low, indicating a negative attitude, your survey may not have concurrent validity.

Finally, the most rigorous validity test you can put your attitude survey through is the construct validity check. Do the scores your survey produces correlate with other related constructs in the anticipated manner? For example, if your attitude survey has construct validity, a more negative attitude (a lower attitude score) should go with lower life satisfaction survey scores and with higher life stress scores. In terms of the scores themselves, attitude should correlate positively with life satisfaction and negatively with life stress.


CHAPTER TWO

RELIABILITY


In statistics, reliability is the consistency of a set of measurements or measuring instrument. Reliability does not imply validity. That is, a reliable measure is measuring something consistently, but not necessarily what it is supposed to be measuring. For example, while there are many reliable tests of specific abilities, not all of them would be valid for predicting, say, job performance.


THEORY OF RELIABILITY

Reliability is intimately related to the concept of random error of measurement. It is generally accepted that all measurements of human qualities contain some error. When one administers a test to a student, one obtains a score which can be called the observed score. Had the same student been tested on some other occasion with the same instrument, one probably would not have obtained the same observed score.
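The idea that an observed score changes from one occasion to the next can be illustrated with a small simulation. The sketch below assumes, as classical test theory does, that each observed score is a fixed true score plus random error. The function name `observed_scores`, the size of the error, and the example true scores are all hypothetical and are not taken from the sources of this paper.

```python
import random

def observed_scores(true_scores, error_sd, rng):
    """Simulate one testing occasion.

    Each observed score is modeled as the student's (unknown) true score
    plus a random error drawn from a normal distribution with mean 0.
    This additive model is an assumption made for illustration.
    """
    return [t + rng.gauss(0, error_sd) for t in true_scores]

# Hypothetical true scores for five students.
true_scores = [60, 70, 75, 80, 90]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
first_occasion = observed_scores(true_scores, error_sd=3, rng=rng)
second_occasion = observed_scores(true_scores, error_sd=3, rng=rng)

for t, a, b in zip(true_scores, first_occasion, second_occasion):
    print(f"true = {t}   occasion 1 = {a:.1f}   occasion 2 = {b:.1f}")
```

The smaller the random error relative to the spread of true scores, the closer the two occasions agree, which is what a high reliability coefficient expresses.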


THE RELIABILITY INDICES

There are four commonly used methods for estimating the reliability of a test. Three of them (test-retest, equivalent forms, and split-half) are based on correlational procedures.



KINDS OF RELIABILITY

TEST-RETEST RELIABILITY

An obvious way to estimate the reliability of a test is to administer it to the same group on two occasions and correlate the paired scores. In other words, the same subjects take the same test twice or more at different times, and the results are then compared. The results of two administrations can be correlated using the Pearson product-moment correlation. This reliability coefficient, because it is indicative of the consistency of subjects over time, is referred to as the coefficient of stability. A disadvantage of this procedure arises from the fact that human qualities and characteristics are continually changing. On a test of marksmanship, for example, some individuals may improve through practice between the two occasions, while the marksmanship of others deteriorates for some reason. Memory is a further problem: subjects who recall their earlier responses may simply repeat them, so the test-retest technique is not appropriate in any situation in which memory may play a role. Its use in schools is largely restricted to measures of physical fitness and athletic prowess.
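The test-retest calculation can be sketched in Python as follows. The function `pearson_r` is a plain implementation of the Pearson product-moment formula written for this illustration, and the two score lists are invented; neither comes from the sources of this paper.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical scores of the same five subjects on two occasions.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

stability = pearson_r(first_administration, second_administration)
print(f"coefficient of stability: {stability:.2f}")
```

A coefficient close to 1 means the subjects kept nearly the same relative standing on both occasions; it says nothing about whether the test measures what it is supposed to measure.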


EQUIVALENT-FORMS RELIABILITY

The equivalent-forms technique of estimating reliability, which is also referred to as the alternate-forms or parallel-forms technique, is used when it is probable that subjects will recall their responses to the test items. If the two forms are administered at essentially the same time (in immediate succession), the resulting reliability coefficient is called the coefficient of equivalence. If subjects are tested with one form on one occasion and with another form on a second occasion and their scores on the two forms are correlated, the resulting coefficient is called the coefficient of stability and equivalence. The equivalent-forms technique is recommended when one wishes to avoid the problem of recall or practice effect and in cases when one has available a large number of test items from which to select equivalent samples. It is generally considered that the equivalent-forms procedure provides the best estimate of the reliability of academic and psychological measures.

SPLIT-HALF RELIABILITY

It is possible to get a measure of reliability from a single administration of one form of a test by using split-half procedures. The test is administered to a group of subjects, and later the items are divided into two comparable halves. Scores are obtained for each individual on the comparable halves, and a coefficient of correlation is calculated between these two scores.

A problem with this procedure is in splitting the test to obtain two comparable halves. If, through item analysis, one establishes the difficulty level of each item, one can place the items into two groups on the basis of equivalent difficulty and similarity of content. To transform the split-half correlation into an appropriate reliability estimate for the entire test, the Spearman-Brown prophecy formula is employed:

    r_full = (2 x r_half) / (1 + r_half)

where r_half is the correlation between the two halves and r_full is the estimated reliability of the full-length test.
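The split-half procedure and the Spearman-Brown step-up can be sketched in Python. Two things here are assumptions rather than part of the procedure described above: the items are split into odd-numbered and even-numbered items (a common shortcut, not the difficulty-matched split described in the text), and the function names `odd_even_halves` and `spearman_brown` as well as the example values are hypothetical.

```python
def odd_even_halves(item_scores):
    """Return one subject's total on the odd-numbered and even-numbered items.

    Assumption: an odd-even split stands in for two comparable halves.
    """
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd_total, even_total

def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    if r_half <= -1:
        raise ValueError("r_half must be greater than -1")
    return 2 * r_half / (1 + r_half)

# One hypothetical subject, six items scored 1 (right) or 0 (wrong).
print(odd_even_halves([1, 0, 1, 1, 0, 1]))  # (2, 2)

# Suppose the half scores of a whole group correlate at 0.60.
r_full = spearman_brown(0.60)
print(f"estimated full-test reliability: {r_full:.2f}")
```

The step-up is needed because each half is only half as long as the real test, and a shorter test is generally less reliable than a longer one built from comparable items.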


CHAPTER THREE

STEPS TO INCREASE VALIDITY AND RELIABILITY


Ways to increase validity and reliability in quantitative research. There are six steps to consider so that the instrument used in the research fulfils the requirements of validity and reliability:
1. Determining the aspects to be developed in the instrument
2. Developing the questions
3. Collecting the data
4. Refining the instrument
5. Checking reliability with new data
6. Checking construct validity

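Ways to increase validity and reliability in qualitative research. In qualitative research, validity and reliability can be increased through triangulation. Triangulation is the combination of two or more methods of collecting data; it is used to strengthen the data or to reach a more appropriate conclusion.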

There are several kinds of triangulation (Cohen and Manion, 1980):

1. Time triangulation
2. Place triangulation
3. Theory triangulation
4. Method triangulation
5. Researcher triangulation



CONCLUSION

Validity refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure.

Scholars discuss several types of internal validity, including the following:

* Face Validity
* Criterion-Related Validity
* Construct Validity
* Content Validity

Reliability is the consistency of a set of measurements or of a measuring instrument. Reliability does not imply validity.


KINDS OF RELIABILITY

1. Test-retest reliability
2. Equivalent-forms reliability
3. Split-half reliability


Ways to increase validity and reliability in quantitative research

1. Determining the aspects to be developed in the instrument
2. Developing the questions
3. Collecting the data
4. Refining the instrument
5. Checking reliability with new data
6. Checking construct validity


Ways to increase validity and reliability in qualitative research

1. Time triangulation
2. Place triangulation
3. Theory triangulation
4. Method triangulation
5. Researcher triangulation






