Categorical Variables in Developmental Research - Methods of Analysis

by: Alexander von Eye, Clifford C. Clogg (Eds.)

Elsevier Trade Monographs, 1996

ISBN: 9780080528717, 286 pages

Format: PDF, ePUB, OL

Copy protection: DRM

1

Measurement Criteria for Choosing among Models with Graded Responses


David Andrich, Murdoch University, Western Australia

1 INTRODUCTION


It is generally accepted that measurement has been central to the advancement of empirical physical science. The prototype of measurement is the use of an instrument to map the amount of a property on a real line divided into equal intervals by thresholds sufficiently fine that their own width can be ignored; in its elementary form, this is understood readily by young school children. However, the function of measurement in science goes much deeper: it is central to the simultaneous definition of variables and the formulation of quantitative scientific theories and laws (Kuhn, 1961). Furthermore, when proper measurement has taken place, these laws take on a simple multiplicative structure (Ramsay, 1975).

Although central to formulating physical laws, it is understood that measurement inevitably contains error. In expressions of deterministic theories, these errors are considered sufficiently small that they are ignored. In practice, the mean of independent repeated measurements, the variance of which is inversely proportional to the number of measurements, can be taken to increase precision to a point where errors can indeed be ignored. It is also understood that instruments have operating ranges; in principle, however, the measurement of an entity is not a function of the operating range of any one instrument, but of the size of the entity.
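The statement about repeated measurements can be written out as a short derivation. The notation is an assumption introduced here for illustration and is not taken from the chapter: n independent measurements x_1, ..., x_n of the same entity, each with the same error variance σ².

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
\operatorname{Var}(\bar{x})
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{n\,\sigma^{2}}{n^{2}}
  = \frac{\sigma^{2}}{n}.
```

Under these assumptions the standard error of the mean is σ/√n, so quadrupling the number of measurements halves it. The middle step relies on independence; with correlated errors the variance would not shrink this quickly.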

Graded responses of one kind or another are used in social and other sciences when no measuring instrument is available, and these kinds of graded responses mirror the prototype of measurement in important ways. First, the property is envisaged to be continuous, such as an ability to perform in some domain, or an intensity of attitude, or the degree of a disease; second, the continuum is partitioned into ordered adjacent (contiguous) intervals, usually termed categories, that correspond to the units of length on the continuum. In elementary treatments of graded responses, the prototype of measurement is followed closely in that the successive categories are simply assigned successive integers, and these are then treated as measurements. In advanced treatments, a model with a random component is formalized for the response and classification processes, the sizes of the intervals are not presumed equal, and the number of categories is finite.
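The process described above (a continuous property, a continuum partitioned by ordered thresholds into adjacent categories of unequal width, and a random component in the response) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold values, the normally distributed error, and the names `grade` and `graded_response` are hypothetical and are not the model developed in this chapter.

```python
import bisect
import random


def grade(value, thresholds):
    """Return the ordered category (0 .. len(thresholds)) that contains value.

    thresholds must be sorted in increasing order; a value exactly equal to a
    threshold is placed in the higher of the two adjacent categories.
    """
    return bisect.bisect_right(thresholds, value)


def graded_response(location, thresholds, error_sd, rng):
    """Classify a latent location after adding a random error.

    The normal error distribution is an assumption made for this sketch.
    """
    return grade(location + rng.gauss(0.0, error_sd), thresholds)


# Hypothetical thresholds: three cut points give four ordered categories
# whose interval widths are deliberately unequal.
thresholds = [-1.5, 0.0, 0.4]

rng = random.Random(0)  # fixed seed so the run is reproducible
responses = [graded_response(0.2, thresholds, 0.5, rng) for _ in range(1000)]

# Frequency of each category for an entity located at 0.2 on the continuum.
counts = [responses.count(k) for k in range(len(thresholds) + 1)]
print(counts)
```

Without the error term, an entity located at 0.2 would always be classified in category 2; with it, the same entity produces a distribution of responses over adjacent categories, and it is this distribution that a model for graded responses has to formalize.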

This chapter is concerned with criteria, and the choice of a model that satisfies these criteria, so that the full force of measurement can be exploited with graded responses of this kind. The criteria are not applied to ordered variables where instruments for measurement already exist, such as age, height, income expressed in a given currency, and the like, which in a well-defined sense already meet the criteria. Although one new mathematical result is presented, this chapter is also not about statistical matters such as estimation, fit, and the like, which are already well established in the literature. Instead, it is about looking at a relatively familiar statistical situation from a relatively nonstandard perspective of measurement in science.

Whether a variable is defined through levels of graded responses that characterize more or less of a property, or whether it is defined through the special case of measurement in which the accumulation of successive amounts of the property can be characterized in equal units, central to its definition is an understanding of what constitutes more or less of the property and what brings about changes in the property. It is in expressing this relationship between the definition and changes in terms of a model for measurement that generalizes to graded responses, and in articulating a perspective of empirical enquiry that backs up this expression, that this chapter contributes to the theme of the analysis of categorical variables in developmental research.

2 MEASUREMENT CRITERIA FOR A MODEL FOR GRADED RESPONSES


In this section, three features of the relationship between measurement and theory are developed as criteria for models for measurement: first, the dominant direction of the relationship between theory and measurement; second, the structure of the models that might be expected to apply when measurements have been used; and third, the invariance of the measurement under different partitions of the continuum. These are termed measurement criteria. It is stressed that in this argument, the criteria established are independent of, and a priori to, any data to which they might apply. In effect, the model chosen is a formal rendition of the criteria; it expresses in mathematical terms the requirements to which the graded responses must conform if they are to be like measurements, and therefore the model itself must exhibit these criteria. Thus the model is not a description of any set of data, although it is expected that data sets composed of graded responses can be made to conform to the model, and that even some existing ones may do so. Moreover, the criteria are not assumptions about any set of data that might be analyzed by the model. If data collected in the form of graded responses do not accord with the model, then they do not meet the criteria embedded in the model, but this will not be evidence against the criteria or the model. Thus it is argued that graded responses, just like measurements, should subscribe to certain properties that can be expressed in mathematical terms, and also that the data should conform to the chosen model and not the other way around; that is, the model should not be chosen to summarize the data. This position may seem nonstandard, and because it is an aspect of a different perspective, it has been declared at the outset. It is not, however, novel, having been presented in one form or another by Thurstone (1928), Guttman (1950), and Rasch (1960/1980), and reinforced by Duncan (1984) and Wright (1984).

2.1 Theory Precedes Measurement


Thomas Kuhn is well known for his theory of scientific revolutions (Kuhn, 1970). In this chapter, I will invoke a part of his case, apparently much less known than the revolutionary theory itself, concerning the function of measurement in science (Kuhn, 1961) in which he stands the relationship between measurement and theory as traditionally perceived on its head:

In textbooks, the numbers that result from measurements usually appear as the archetypes of the "irreducible and stubborn facts" to which the scientist must, by struggle, make his theories conform. But in scientific practice, as seen through the journal literature, the scientist often seems rather to be struggling with the facts, trying to force them to conformity with a theory he does not doubt. Quantitative facts cease to seem simply "the given." They must be fought for and with, and in this fight the theory with which they are to be compared proves the most potent weapon. Often scientists cannot get numbers that compare well with theory until they know what numbers they should be making nature yield. (Kuhn, 1961, p. 171)

Kuhn (1961) elaborates the … "paper's most persistent thesis: The road from scientific law to scientific measurement can rarely be traveled in the reverse direction" (p. 219, emphasis in original).

If this road can seldom be traveled in the physical sciences, then it is unlikely to be traveled in the social sciences. Yet, I suggest that social scientists attempt to travel this route most of the time by modeling available data, that is, by trying to find models that will account for the data as they appear. In relentlessly searching for statistical models that will account for the data as given, and finding them, the social scientist will eschew one of the main functions of measurement, the identification of anomalies:

To the extent that measurement and quantitative technique play an especially significant role in scientific discovery, they do so precisely because, by displaying serious anomaly, they tell scientists when and where to look for a new qualitative phenomenon. To the nature of that phenomenon, they usually provide no clues. (Kuhn, 1961, p. 180)

And this is because

When measurement departs from theory, it is likely to yield mere numbers, and their very neutrality makes them particularly sterile as a source of remedial suggestions. But numbers register the departure from theory with an authority and finesse that no qualitative technique can duplicate, and that departure often is enough to start a search. (Kuhn, 1961, p. 180)

Although relevant in general, these remarks are specifically relevant to the role of measurement and therefore to the role that graded responses can have. Measurement formalizes quantitatively and efficiently a theoretical concept that can be summarized as a variable in terms of degree, similar in kind but greater or lesser in intensity, in terms of more or less, greater or smaller, stronger or weaker, better or worse, and so on, that is to be studied empirically. If it is granted that the variable is an expression of a theory, that is, that it is an expression of what constitutes more or less, greater or smaller, and so on, according to the theory, then when studied empirically, it becomes important to invoke another principle of scientific enquiry, that of falsifiability (Popper, 1961). Although Popper and Kuhn have disagreed on significant aspects of the philosophy...