This essay demystifies the concepts of validity and reliability, setting out the differences between the two and explaining which of them is more important. It also lays out the ways of assessing each of them and the different ways of obtaining validity evidence. Validity and reliability can seem synonymous because together they describe the overall quality of an assessment; in technical terms, however, they do not mean the same thing. By definition, validity is the extent to which results and procedural evidence support the interpretation of test scores. It refers to whether a test or procedure really measures what it claims to measure (Garson, 2016).
Validity is usually more of a theoretically oriented issue. When one says an assessment is valid, it means the assessment has measured what it was intended to measure. Reliability, by contrast, is an estimate of the consistency of results or scores. It concerns the extent to which a measurement procedure or test produces similar results over repeated trials of the same nature. When we say an assessment is reliable, it means it gives the same results under similar or identical circumstances. In summary, validity concerns how strong an outcome is, while reliability concerns the consistency of the outcome. Reliability is easier to determine, whereas validity requires more analysis before a conclusion can be reached. There are four types of validity, namely internal, conclusion, construct and external validity, while reliability is assessed through internal consistency and repeated testing. When the two are critically analysed, reliability comes out as more important than validity. This is because it is a precondition for validity: a person cannot base decisions on outcomes that are not consistent. Any outcome has to be reliable first for it to be valid.
Assessing reliability and validity marks the beginning of the process of understanding the complex issues related to measurement in both applied and theoretical research settings. According to Markus & Borsboom (2013), there are four basic methods of assessing reliability: the split-halves method, the internal consistency method, the alternative form method and the retest method. In the split-halves approach, the test need only be administered once, on a single occasion. The whole set of items is separated into halves, and the scores on the two halves are correlated to obtain an estimate of reliability. The resulting halves can be treated as approximations to alternative forms. Second, the internal consistency method requires no splitting of items. The items are analysed as a group, and the resulting coefficients are termed measures of internal consistency. This technique involves administering a single test and thus offers a unique estimate of reliability. Next is the alternative form method (Kalwani & Silk, 2011). This is used mainly in academic examinations to estimate the reliability of all kinds of tests. It resembles the retest method in that it also requires two testing occasions involving the same individuals. The only difference is that the same test is not given on the second occasion; instead, an alternative form of the test is administered. Both forms are intended to measure the same thing. Finally, there is the retest method, the easiest way of assessing reliability. In this approach, the same test is administered to the same people, in a consistent manner, after a given period. One then obtains the correlation between the scores from the two or more administrations of the same test.
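The split-halves, internal consistency and retest calculations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a procedure taken from the cited authors: the data are invented, the odd/even item split is only one way of forming halves, the Spearman-Brown step-up and Cronbach's alpha are standard coefficients that the essay itself does not name, and all function names are hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_half_reliability(item_scores):
    """Split-halves estimate from a single administration.

    item_scores has one row per person and one column per item.
    Assumption: halves are formed from odd- and even-numbered items, and
    the half-test correlation is stepped up to full test length with the
    Spearman-Brown formula (not mentioned in the essay).
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(first_half, second_half)
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(item_scores):
    """Internal consistency coefficient; no splitting of items needed."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def retest_reliability(first_totals, second_totals):
    """Correlation between total scores from two administrations."""
    return pearson(first_totals, second_totals)


# Hypothetical data: 5 respondents x 4 items, invented for illustration.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 3],
    [3, 3, 4, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
first_totals = [sum(row) for row in scores]
# Hypothetical total scores of the same people on a second occasion.
second_totals = [17, 11, 15, 18, 7]

print("split-half:", round(split_half_reliability(scores), 3))
print("alpha:", round(cronbach_alpha(scores), 3))
print("retest:", round(retest_reliability(first_totals, second_totals), 3))
```

The alternative form method would use the same correlation as `retest_reliability`, with the second list holding totals from the alternative form instead of a repeat of the same test.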
There are several types of validity, or modes of assessing it. These include criterion validity (which comprises concurrent validity as well as predictive validity), construct validity, content validity and face validity. Criterion validity is of two kinds, distinguished by time frame. The first is concurrent (simultaneous) validity, where the measure should distinguish between individuals at the present time, for example between those who would be considered good for a job and those who would not. The second is predictive validity, which holds that predictions made on the basis of the assessment results will prove accurate. This type of validity is most used, and most important, where the main purpose of the assessment is selection. A good example is a student's strong performance in a certain A-level subject, say mathematics: if the assessment is predictively valid, the same student can be expected to perform well in mathematics at university (Kalwani & Silk, 2011). Second is construct validity. This is essentially how closely the assessment relates to the domain that one wishes to assess. Construct validity is used when evaluating the validity of empirical measures with respect to theoretical concepts. Its main concern is the degree to which a particular measure relates to other measures in a manner consistent with theoretically derived hypotheses about the concepts being measured. Construct validation requires specifying the theoretical relationship between the concepts, examining the empirical relationship between the measures, and finally interpreting the empirical evidence in terms of how it clarifies the validity of the particular measure. The third type is content validity, also known as curricular validity. This type of validity plays an important role in the development of educational curricula and in the assessment of various types of tests.
Content validity depends on the degree to which an empirical measurement reflects a particular domain of content. Ensuring content validity means ensuring that the learning objectives for the content are closely related to the outcomes expected of a successful learner. Finally, there is face validity. Though less commonly used, it concerns the “face value” of a measure: one looks at the content and agrees that, on the face of it, the measure captures the concept being measured. In other words, it evaluates whether each of the measuring items appears to match the conceptual domain of the concept.
There are several ways of obtaining validity evidence. These include test content, response processes, associations with other variables and the internal structure of tests. First is test content. This validity evidence concerns the match between the content that should be included in the test and the actual content of the test. If a test is to be interpreted as a measure of a specific construct or content area, then its content should reflect the particular aspects of that construct or content area (Kalwani & Silk, 2011). Second is the test’s internal structure. This refers to the manner in which the parts of a test relate to one another. For example, some tests include items that all correlate highly with each other, while other tests include items that fall into two or more groups. Thus, a vital validity issue is the match between the actual internal structure of a test and the structure that the test should have. Third is response processes. This concerns the match between the psychological processes that respondents actually use and the processes that they should use. Most measures are based on assumptions about the psychological processes people engage in when completing the measure. The final one is association with other variables. This concerns the relationships that exist between test scores and other variables. The modern view of validity acknowledges that a theoretical understanding of the construct is essential to the interpretation of test scores (Garson, 2016).
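Two of these sources of evidence, internal structure and association with other variables, lend themselves to a simple numerical illustration. The sketch below is only an assumption about how such evidence might be examined: the item scores and the external criterion are invented, the function names are hypothetical, and plain Pearson correlations stand in for the more elaborate techniques (such as factor analysis) that are often used in practice.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def inter_item_correlations(item_scores):
    """Correlation matrix between items; rows of the input are respondents."""
    k = len(item_scores[0])
    columns = [[row[i] for row in item_scores] for i in range(k)]
    return [[pearson(columns[i], columns[j]) for j in range(k)]
            for i in range(k)]


def criterion_correlation(test_totals, criterion):
    """Association between test totals and an external variable."""
    return pearson(test_totals, criterion)


# Hypothetical data: 6 respondents x 4 items. Items 0-1 and items 2-3 were
# invented so that they fall into two groups.
item_scores = [
    [5, 4, 1, 2],
    [4, 5, 5, 5],
    [5, 5, 2, 1],
    [2, 1, 4, 5],
    [1, 2, 2, 1],
    [2, 1, 5, 4],
]
# Hypothetical external criterion, e.g. a later course mark per respondent.
criterion = [60, 82, 65, 58, 40, 62]

matrix = inter_item_correlations(item_scores)
for row in matrix:
    print([round(r, 2) for r in row])

totals = [sum(row) for row in item_scores]
print("test-criterion r:", round(criterion_correlation(totals, criterion), 2))
```

In the printed matrix, high correlations within a pair of items and weak correlations across pairs would suggest that the items fall into two groups, which can then be compared with the structure the test is supposed to have.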
In conclusion, validity and reliability are two concepts that look similar but do not have the same meaning; explained in technical terms, they are in fact different concepts. The two terms are often used in scholastic works such as term papers, theses, research papers and the like.