History
Although Aristotle was Plato’s most distinguished student,
he did not lead Plato’s Academy after Plato, but rather founded
his own school, the Lyceum. This was probably because of the great
dispute between the teacher and his student. Both Plato and Aristotle
agreed that knowledge, episteme, was perfect, unchanging
and categorical. Both also agreed that the world of experience --
the world in which we all live -- is in constant flux. The pre-Socratic
philosopher Heraclitus said that you can’t step in the same river
twice, meaning that every aspect of the river changes between the
first step and the second. He believed that the only reality was
change.
Heraclitus’ great rival, Parmenides, believed the opposite.
The Greeks embraced two philosophical principles very strongly:
the first is the principle of non-contradiction, which says that
an entity can’t be and not be the same thing at the same time.
A tomato might be green, or it might be red, but it can’t be
red and green at the same time. The other is the principle
of causality, which says that nothing can come from nothing, nor
can anything pass into nothing.
Based on these principles, Parmenides argued that change was impossible,
and the flux apparent in our experiences must be an illusion. If
a green tomato ripened, for example, Parmenides asked where did
the red tomato come from? And where did the green tomato go? Green
can’t be the cause of red, because the effect has to be like
its cause. The green tomato can’t be annihilated, because nothing
can pass into nothing, and the red tomato can’t come from nothing,
so this change can’t actually happen -- it must be an illusion.
Plato tended to agree with Parmenides. (In Plato’s dialogues,
Socrates wins in every case but one -- his dialogue with Parmenides.
But Plato opines that Socrates was young at the time.) He, too,
regarded the world of experience to be illusion, a source of error
to be avoided. The true source of knowledge was not experience,
but the World of Ideas, and the proper road to true knowledge was
to lead it out (e ducere) of your own mind, where it has lain forgotten
since our original sojourn in that world.
Aristotle disagreed with Plato about this. He suggested that the
world of experience was made up of two principles: primary matter
and substantial form. Primary matter is the basis of an entity’s
existence, but it is its substantial form that makes it what it is. Change
is simply the process by which restless primary matter casts off
and takes on forms. It is the forms which we come to know by abstraction
from our experience. Thus Aristotle disagrees with Plato about his
most important belief: he believes knowledge comes from experience,
and advocates examining the world of experience to gain knowledge.
But Aristotle has not yet solved Parmenides’ dilemma. He needs
to account for the succession of forms. Where do the forms come
from, and where do they go? According to the view that Aristotle,
Plato, Parmenides and the rest accepted, the forms must be caused
by something, since nothing can come to be from nothing, and the
cause must contain the form in it in some way, since a cause must
be like its effect.
Aristotle solves this problem by asserting that the form or essence
of any entity contains within it from the first moment of its existence
all the forms through which it will pass during its existence. They
are there potentially (in potency), and come to be actually there
(in act) when they are realized. An acorn contains leaves, bark
and the rest of the attributes of an oak tree in potency, but the
oak tree has them actually (in act). The form (essence) of a tomato
contains both the forms “green” and “red”, first
red in potency and then red in act.
This solves the problems of causality and non-contradiction for
any given entity, but does not solve the problem entirely. Aristotle
also needs to solve the problem of the succession of things themselves.
If the child was caused by its parents, and its parents by their
parents, what does the entire sequence depend on? Aristotle solves
this problem by positing an “uncaused cause,” an entity
existing for all time which contains all the forms that will ever
be. This preexisting set of forms is what Aristotle calls “final
causes,” the preexisting ideas of what a thing is
to become.
Aristotle’s concept of motion follows this model exactly. When
an object is at rest, it is in a place, but it is potentially in
another place; it is in one place “in act” and another
place “in potency.” Once it has moved it is in the new
place “in act.” When it is moving, however, it is not
in any place. For Aristotle, motion is “the act of a being
in potency insofar as it is in potency.” The object contained
its final resting place “in potency” as part of its substantial
form from the first moment of its existence, and intended to go
there. Heavy objects “belong” at the center of the earth
and try to go there; light objects, such as fire, want to be at
the periphery of the world, and try to go there. The impetus or
cause of their motion lies within them. An Aristotelian explanation
of motion consists of understanding the state of the object before
and after the motion, but does not focus on the moving itself.
Is it possible that the inadequacy of the Aristotelian concept of motion
made it impossible for analysts to see it precisely? Despite Galileo’s extraordinary ingenuity, precise description of
complicated motions had to await the development of Newton’s and Leibniz’s new language, calculus, which made it possible to describe motions
as precisely as desired, if not perfectly.
In advocating careful and extensive study of the world of experience,
Aristotle strongly supports the development of science, but the
categorical, absolute character of Aristotle’s view of
knowledge is fundamentally different from knowledge as science understands
it. Science rejected Aristotle’s idea of the “final cause” in describing motion, and explosive progress followed quickly.
The Social Sciences
Not all the sciences rejected Aristotelian thinking. Just as Aristotle
believed a rock needed to know where it wanted to go before it went,
he believed that humans needed to intend to do whatever they did.
If I go to the store, it’s because I intended to go to the
store before I left; if I eat a banana, the intention to eat the
banana must preexist my act. For Aristotle, behavior or action is
“the act of a being in potency insofar as it is in potency,”
and not analyzed further. The action is not the focus of inquiry,
but the state of the person before and after the action. Action
or behavior is a series of indistinct blurs between states of being.
While it is amazing that Aristotle’s theory of motion was held
in the face of overwhelming evidence for 2000 years by very intelligent
and thoughtful scholars, it is perhaps even more amazing that Aristotle’s
theory of human behavior is still held by most social scientists
2400 years later.
Rejecting Aristotle is not easy. In the first half of the 20th century,
Alfred
Korzybski developed a theory strongly critical of
Aristotle’s methods which became very popular, and led to the
foundation of the discipline of general
semantics. Korzybski’s work focuses on the discrepancy
between the flux of our experiences versus the ideal, static, categorical
structure of language, and warns that failure to realize the crude
approximation that language provides for experience blinds one to
seeing and understanding. Korzybski’s work led to diverse
followings, including academic scientists like S.
I. Hayakawa, cult-like movements such as noology,
and literary works, particularly those by A.
E. van Vogt.
As with most debate cases, the need is strong but the plan is weak.
Korzybski and his followers do a convincing job of showing that
the continuous, flowing world of experience is inadequately represented
by the categorical Aristotelian language we use to describe it,
but provide no systematic procedure for overcoming the problem.
Even though Gilbert Gosseyn, the hero of A.E. van Vogt’s provocative
novels, calls on the power of non-aristotelian philosophy and methods,
his success is attributable much more to his extra brain, which
gives him the ability to teleport across galactic distances, and
a stash of extra bodies which come to life in succession as each
old one is killed.
The Categorical Character of Social Science
While the physical sciences were making giant strides after abandoning
the categorical model of Aristotle for the comparative model of
Galileo, the social scientists remained steadfastly categorical in their thinking. In psychology, the basic model of human behavior
remained categorical and intentional, with prior states of mind,
such as attitudes, wishes, motives, needs, or other psychological
states providing an impetus which led to a behavior, which then
resulted in an end state. Many psychologists -- perhaps a great
majority early in the 20th century -- believed these motives were
built in genetically from the first moment of a person’s existence.
In sociology and anthropology, the most common general theory believed
that society had a structure which consisted of statuses,
which were distinct “locations” in the society (“status”
is Latin for “place.”) All the statuses were arranged
in a hierarchy. Karl Marx recognized three distinct levels of status:
the rulers, the working class, and a middle class that was doomed
to be driven down into the working class. Max Weber opined that
there were three parallel situses in the hierarchy, with stratification
based on wealth, status and power.
Anthropologists differed among
themselves as to how many classes there were. Some recognized three,
lower, middle and upper; some added “working class” to
make a four-tier system. Lloyd
Warner identified six. In any event, each of the
classes was thought of as a discrete, categorical thing. Mobility meant the ability of an individual to move from the class
in which s/he was born to another. In an open society, the boundaries
between classes were thought to be porous, and mobility occurred
frequently; in a closed society, boundaries were rigid and mobility
rare.
Some
sociologists, however, conceived of status as a continuous,
comparative, quantitative variable.
Archie O. Haller, a University of Wisconsin sociologist
with an engineering background, suggested a modern comparative model
of stratification. Haller did not conceive of social mobility as
the discrete change of an individual from one status to another,
but rather as a lifetime trajectory, as a point moving in a continuous
stratification space. Along with William
Sewell and Alejandro
Portes, he published a model of the status attainment
process which suggested that the trajectory of individuals through
the status hierarchy was determined by their continuously changing
aspirations, which were themselves influenced by the expectations
of their significant
others. Their findings indicated support for the
model, but their results were attenuated by the poor quality categorical
measurements in the secondary data that was available to them. Haller
designed a new study to develop superior instrumentation, which
resulted in the Wisconsin Significant Other Battery (WISOB), a set
of questionnaires which identified the most significant others for
adolescent children, and measured the aspirations of the children
and the expectations of their significant others.
The WISOB was itself a categorical device, in which adolescent children
were asked to name the people who communicated most with them in
each of four categories. The measurement of educational and occupational
aspirations and expectations, however, had some comparative characteristics.
Level of educational aspirations and expectations were measured
by asking students how far they planned to go through school and
by asking significant others how far they expected the adolescent
to go through school. Although the answers were recorded categorically
(e.g., some high school, finish high school, etc.) they corresponded
roughly to years of schooling, which is comparative.
Level of occupational
aspirations and expectations were calculated by asking students
what specific jobs they expected to be able to get, and by asking
significant others what specific jobs they expected the child to
be able to get. The level of occupational prestige of each of these
jobs was recorded based on the NORC
Occupational Prestige scale, a quasi-comparative
scale with approximately a 90 point range. Scores of all the jobs
for a single adolescent were averaged to provide an estimate of
the level of occupational aspiration and expectation.
An important part of the model to be tested hypothesized that the
aspirations of the adolescent respondents would be strongly influenced
by the expectations of their significant others, following Mead,
Sullivan
and others. Since the significant others were identified by the
adolescents rather than preselected by the investigators, respondents
differed in the number of significant others they reported. This
produced a difficult analytic situation since there are no traditional
multivariate analysis methods that allow a different number of variables
per case. After much study and consultation, the investigators (at
this point Haller, Joseph
Woelfel and Edward
L. Fink) decided to calculate the average expectations
of all significant others for each respondent, and use this average
as an indicator of the expectations. This turned out to be a very
good predictor of the respondents’ aspirations -- by far the
best in the literature by a very large margin.
No one at the Significant Other Project ("other" than what, you might
ask?) had any theoretical justification for choosing the mean, but
chose it solely as a heuristic to overcome the problem of different
numbers of variables per case. After the fact, however, the results
seemed very reasonable. If each individual significant other’s
expectation could be thought of as a force acting on the individual’s
aspiration, then the mean of all those forces would represent a
balance point where the net force was zero.
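The zero-net-force reading of the mean is easy to check numerically. In the sketch below the expectations are invented values on some comparative scale (say, expected years of schooling); each one pulls the aspiration toward itself with a force proportional to its signed distance, and the pulls cancel exactly at the mean:

```python
# Expectations of four significant others, on some comparative scale
# (say, expected years of schooling). All values are hypothetical.
expectations = [12.0, 16.0, 14.0, 18.0]

# Treat each expectation as a force pulling the aspiration toward it,
# proportional to its signed distance from position x.
def net_force(x, expectations):
    return sum(e - x for e in expectations)

# The mean is the balance point: the net force there is zero.
balance_point = sum(expectations) / len(expectations)

print(balance_point)                           # 15.0
print(net_force(balance_point, expectations))  # 0.0
```

Below the mean the net pull is upward, above it downward, which is what makes the mean a plausible equilibrium for the aspiration.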
Now at the University
of Illinois, Woelfel, along with John Saltiel, Donald
Hernandez, and Curtis Mettlin (with some help from Ken Southwood) worked
out the algebra of the force model for the one dimensional case,
and Hunter, Danes, & Woelfel provided experimental evidence that this model fit
observations better than alternative plausible models. This work,
generally referred to as a theory of linear force aggregation, resulted
in a series of publications showing that attitudes of respondents
tended to lie near the weighted average of the expectations of significant
others, controlling for important social structural factors.
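The prediction at the heart of linear force aggregation can be sketched as a weighted average; the expectations and weights here are invented, with the weights standing in for whatever gives one significant other more influence than another (for example, frequency of communication):

```python
# Hypothetical expectations of four significant others, and weights
# standing in for each one's influence on the adolescent.
expectations = [12.0, 16.0, 14.0, 18.0]
weights = [2.0, 1.0, 1.0, 4.0]

# Predicted attitude: the weighted average of the expectations --
# the point at which the weighted "forces" balance.
predicted = sum(w * e for w, e in zip(weights, expectations)) / sum(weights)
print(predicted)  # 15.75
```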
The space of occupations
Despite the strong support for the averaging model, a major problem
remained: it only applied to attitudes that could be measured on
a comparative scale. It was possible to take the average of the
occupational prestige of several occupations, or the average number
of years of education, or the average number of radical activities,
or the average number of marijuana cigarettes smoked per day, but
just what is the average of Doctor and Airline Pilot?
The averaging model could not be used for discrete choices: If your
mother expects you to be a doctor and your father expects you to
be an airline pilot, just what is the average? This problem could
be solved if each discrete object, such as an occupation, could
be represented as a point in space, close to other objects that
are like it, and far from other objects which are different.
Several spatial representations of social and psychological data
were known. L. L. Thurstone conceived of psychological content in
spatial terms. He conceived of attitudes as “positions”
in a mathematical space, and his scaling procedure involved starting
with large pools of such positions and sorting them into piles until
a final, reduced set of positions lying at approximately equal intervals
remained to form the scale. In his study of human intelligence,
he believed that the measured values on intelligence tests were
a function of a smaller set of “factors” that represented
aspects of mental ability. These “factors” he thought
of as a bundle of intercorrelated vectors in a vector space of
relatively low dimensionality, and he developed procedures for identifying
these factors by extracting the eigenvectors of the matrix of intercorrelations
among test items. It’s important to understand that the central
goal of factor analysis was to find a vector space of considerably
lower dimensionality than the order of the data: any procedure that
did not reduce the dimensionality of the data would be a failure
for Thurstone’s purposes.
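The extraction step can be sketched with a small, invented intercorrelation matrix: take the eigenvectors of the matrix, scale them by the square roots of their eigenvalues to get loadings, and keep only the leading columns. (A real factor analysis also involves communality estimates and rotation, which are omitted here.)

```python
import numpy as np

# A small, hypothetical intercorrelation matrix for four test items.
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
])

# Eigendecomposition of the correlation matrix; reorder largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Loadings: eigenvectors scaled by the square roots of their eigenvalues.
# Keeping only the first k columns is the dimensionality reduction
# Thurstone was after.
k = 2
loadings = eigenvectors[:, :k] * np.sqrt(eigenvalues[:k])

# With all four factors retained, R is reproduced exactly; with k < 4
# it is only approximated.
R_approx = loadings @ loadings.T
```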
Thurstone’s factor analysis yielded a vector space within which various “factors” underlying mental ability were arrayed as generally
correlated vectors, but Thurstone’s factor space had some difficulties:
most important was the standardization of the data in the form of
correlation coefficients that made the factor space into a unit
hypersphere. Moreover, at the time Thurstone developed factor analysis,
the computer had not yet been invented, and factor analyses had
to be done by hand, a laborious procedure involving dozens of graduate
students laboring for weeks. Because of this labor-intensive procedure,
Thurstone developed rules of thumb for determining when “enough” factors had been extracted, which led to the common practice of
presenting a smaller dimensional solution that did not completely
represent the data.
Subsequent practice, in which workers were interested only in the
items which had the highest numerical coordinates (“factor
loadings”) led to the common practice of deleting any coordinates
whose value fell below plus or minus .4. The result was a “factor
space” that did not actually represent the raw data well, and,
in fact, the original correlation matrix could not be regenerated
from the matrix of factor loadings, nor could the original scores
be reproduced from the correlation matrix. Thus began the curious
practice, common to 20th century psychometrics, of compromising
the data to fit preconceived notions of what the resulting space
“ought” to look like. This practice can be attributed
to the Platonic notion that true or correct ideas must have a specific,
perfect form, and the world of experience could only be a source
of distorted and erroneous perceptions.
Not only was the dimensionality of the space expected to be small,
but the dimensions were also supposed to represent some latent factor
or trait. Osgood’s
semantic differential space, which was popular for
a brief period in the early second half of the 20th century, was
also a unit sphere restricted to three named dimensions, which were
always expected to be three orthogonal attributes: good-bad, active-passive,
and strong-weak, but its bipolar measurement system and extensive
list of “degenerate” attributes which would not fit into
the hypothesized three dimensions were problems. Research using
the methods of the semantic differential showed that many attributes
could not be made to fit into the three dimensional unit sphere
that was the semantic differential space, so these were set aside
to a list of “degenerate” attributes which were proscribed
from use.
In 1938, Young
and Householder identified an exact solution to the
problem of defining a spatial coordinate system from a matrix of
inter point distances. This solution, slightly modified by Warren
Torgerson, was presented to psychology in his 1958 textbook, Theory and Methods of Scaling, under
the name “multidimensional scaling,” but it quickly ran into problems.
When given high quality paired comparison data from actual empirical
measurements, results of the Young-Householder-Torgerson method
were usually both high dimensional and non-euclidean.
The high dimensionality was indicated by a large number of eigenvectors
of substantial length, and the non-euclidean character was revealed
by the fact that several of these eigenvectors were imaginary, with
corresponding negative eigenvalues.
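The Young-Householder-Torgerson construction is short enough to sketch: double-center the matrix of squared distances and take its eigendecomposition. The distances below are invented and deliberately violate the triangle inequality, so the decomposition produces a negative eigenvalue, the signature of an imaginary dimension:

```python
import numpy as np

# A hypothetical symmetric matrix of pairwise distances among four
# concepts. d(0,3) = 2.5 exceeds d(0,1) + d(1,3) = 2.0, so the
# triangle inequality is violated and no euclidean configuration
# can reproduce these distances exactly.
D = np.array([
    [0.0, 1.0, 1.0, 2.5],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [2.5, 1.0, 1.0, 0.0],
])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centering matrix

# Double-centered scalar-products matrix: B = -1/2 * J D^2 J.
B = -0.5 * J @ (D ** 2) @ J

# Its eigenvectors give the coordinates; a negative eigenvalue
# signals an imaginary dimension -- a non-euclidean space.
eigenvalues, eigenvectors = np.linalg.eigh(B)
print(eigenvalues)   # the smallest eigenvalue is negative
```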
20th century psychometricians were alarmed by these two characteristics
(although no reasons were ever given for why the space of human
cognition ought to be euclidean and low dimensional), and sought
to find ways to “correct” the solution. Once again, in
the Platonic tradition, psychometricians assumed that the measurements
themselves were inherently untrustworthy, and that the high dimensionality
and non-euclidean character of the space were the result of measurement
error. The belief that human measurements were inherently very crude
also led to the belief that the only useful result of developing
a space of cognition was to produce two dimensional pictorial maps
that would give investigators an intuitive picture of the overall
structure of the space. The idea of the space as an inertial reference
frame within which cognitive processes might be precisely represented
was not present in the psychometric literature of the time.
Attneave (1954) suggested that the paired comparison scales might
not be trusted to have a true zero point, and suggested finding
a smallest constant number (the “additive
constant”) which could be added to every measurement
to make the space euclidean. This procedure, however, still left
the dimensionality of the space high, which most psychometricians
found uncomfortable. Another procedure known at the time was adding
the absolute value of the most negative eigenroot to every
eigenroot, which left all eigenroots non-negative, then renormalizing
the eigenvectors to their new eigenroots. This eliminated all traces
of non-euclideanism from the space, but still left a high dimensional
solution. What’s more, the original measurements differed from
the values regenerated from the newly scaled coordinates by large
margins.
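That correction, and the discrepancy it introduces, can be demonstrated on an invented non-euclidean distance matrix: shift every eigenroot by the magnitude of the most negative one, rescale the eigenvectors, and compare the distances regenerated from the corrected coordinates against the originals.

```python
import numpy as np

# Hypothetical non-euclidean distances (triangle inequality violated).
D = np.array([
    [0.0, 1.0, 1.0, 2.5],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [2.5, 1.0, 1.0, 0.0],
])
n = len(D)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
vals, vecs = np.linalg.eigh(B)      # ascending order; vals[0] most negative

# The "correction": add the magnitude of the most negative eigenroot
# to every eigenroot, then rescale the eigenvectors to the new roots.
shift = -vals[0]
new_vals = vals + shift             # all non-negative now
coords = vecs * np.sqrt(new_vals)

# Regenerate distances from the corrected coordinates and compare.
diff = coords[:, None, :] - coords[None, :, :]
D_new = np.sqrt((diff ** 2).sum(axis=-1))

# The corrected space is euclidean, but it no longer reproduces the
# original measurements.
print(np.abs(D_new - D).max())
```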
Roger
Shepard and Joseph Kruskal independently developed
a “solution” to the problem of high-dimensional non-euclidean
spaces. Since, following Plato, the data provided by measurements
would be grossly distorted, or, following Aristotle, the data provided
by human measurements would be of a much lower order of precision
than that provided by physical experience, the data themselves should
be of secondary importance, and the more important component of
understanding must come from the philosophical appeal of certain
absolute, beautiful forms. Therefore, we may feel free to modify
our measured data until they conform to the ideal space, which should
be of low (two or three) dimensions and euclidean. Curiously, these
writers assign one aspect of experience an inviolable certainty:
they all assume that the ordinality of measurements is trustworthy
and must not be violated. Given this stipulation, investigators
are free to adjust the measured values in any way and by any amount
until they fit into a euclidean space of pre-specified dimensionality
(usually 2 or 3) as long as the order of the original measurements
is not violated. A number, usually Kruskal’s stress or a variant
thereof, is then calculated to assess the degree to which the final
solution violates the ordinality of the original measurements. This
new method of “non-metric
multidimensional scaling” had a major impact
on the field of psychometrics, and, for a long time, almost completely
eclipsed the use of classical Young-Householder-Torgerson methods.
Why, in the confusing world of Plato’s sense experiences, or
Aristotle’s world in which human data can only be perceived
to much broader tolerances than physical things, should these psychometricians
find the ordinal relations of perception to be reliable data? This
is probably due to S.
S. Stevens’ fourfold classification of measurement
as nominal, ordinal, interval and ratio, a taxonomy accepted as
an article of faith by virtually
every textbook in the social sciences.
Within this taxonomy, the lowest form of measurement is nominal,
in which objects of perception can be named. At the second, ordinal
level, the objects can be placed in rank order in terms of some
attribute, but the exact intervals among them cannot be ascertained.
At the third level, perceptual objects can be placed in an order,
and the exact intervals among them can be established, but the location
of a true zero, that is, a point at which none of the attribute
exists, is unknown. Finally, at the highest level of measurement,
the exact intervals among perceptual objects can be established,
and their distances from a true zero point are known.
Stevens’ taxonomy is not derived from principles, nor based
on data, but relies entirely on intuition or common sense. Like
Aristotle’s law of falling bodies, however, Stevens’ classification
is incorrect, and won’t stand up under very simple
scrutiny. Of course, measurements can be classified into four categories,
since any set of perceptions can be classified into any number of
categories arbitrarily, but whether these categories themselves
are an ordinal scale is open to question. Is an interval scale “higher” than an ordinal scale?
The psychometric literature makes it seem as if this is too obvious
to require proof, either formal or empirical, but in fact, the ordinal
property of a distribution of values is only more robust than the
metric values in the case of one special kind of distribution: one
in which the values are very sparse and widely distributed. When
this is the case -- and only when this is the case -- large changes
in the values of elements in the distribution will leave the rank
orders unchanged. In other kinds of distributions, including those
with dense distributions with many elements close to each other
in value, or in data with extensive symmetry, such as the distances
among the features on a human face, very slight changes in the values
of the elements will produce very large changes in the rank orders.
These are not rare cases, but probably typical of the most common
kinds of data. Deciding whether you prefer chocolate to vanilla
may be simple, and perhaps easier than deciding how much you prefer
one to the other, but placing
flavors into rank order is much more difficult --
even more difficult than assigning each flavor a numerical favorability
rating.
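The fragility of rank orders in dense distributions can be demonstrated with a small simulation (all values invented): perturb a sparse and a dense set of values by the same small noise and count how often the rank order changes.

```python
import random

def rank_order(values):
    # Indices of the items sorted by value, i.e., the rank ordering.
    return sorted(range(len(values)), key=lambda i: values[i])

random.seed(0)

sparse = [1.0, 10.0, 25.0, 60.0, 120.0]  # widely separated values
dense = [10.0, 10.1, 10.2, 10.3, 10.4]   # values close together

flips = {"sparse": 0, "dense": 0}
for _ in range(1000):
    noise = [random.gauss(0, 0.3) for _ in range(5)]  # same small jitter
    if rank_order([v + e for v, e in zip(sparse, noise)]) != rank_order(sparse):
        flips["sparse"] += 1
    if rank_order([v + e for v, e in zip(dense, noise)]) != rank_order(dense):
        flips["dense"] += 1

# The dense distribution's rank order flips far more often; the
# sparse one's essentially never does.
print(flips)
```

The same jitter that leaves the sparse ranking untouched scrambles the dense ranking in most trials, which is exactly the situation with dense or symmetric data such as facial distances.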
Not surprisingly, non-metric multidimensional scaling, which once
almost completely eliminated classical procedures from the field,
has already run into serious trouble, and even the leaders of the
non-metric movement now suggest that the classical Young-Householder-Torgerson
procedure is often -- perhaps even usually -- better.
The Galileo Group at the University of
Illinois
The Galileo Group at Michigan State
The Galileo Group at Albany
The Galileo Group at Buffalo
The Galileo Group at The East West Center
But what good is it?
It’s important to remember that the original goal of factor
analysis, the semantic differential, and multidimensional scaling
was to find a space of low dimensionality spanned by a small set
of vectors that represented meaningful psychological attributes.
The Galileo developers, however, had no interest in this. The original
goal was specifically to establish a coordinate reference frame
that could be used as a mathematical aid for describing cognitive
processes such as attitude and belief changes over time. The fact
that strong evidence indicated that the space of cognitive processes,
when measured with ratio-level paired comparisons and using exact
rather than approximate scaling algorithms, was high dimensional
and non-euclidean was of no consequence. The philosophy behind Galileo
is straightforward and consistent: 1) measure as precisely as possible,
2) introduce no distortion into the analysis at any point, and 3)
accept the results as they are. Following this philosophy rigorously
generally produces high dimensional, non-euclidean spaces. But what
are they good for, and why would anyone want to make one?
The usefulness of Galileo space is that events in the space correspond
to events of interest in experience. Each of the points in a Galileo
space represents a social object, following Mead, and such objects
can be, as Blumer notes, “...anything that can be designated
or referred to.” The self is an object in this system, and
can be positioned in a Galileo space. Behaviors are also objects,
and can be arrayed in the same space. Wisan’s dissertation
supported the hypothesis that behaviors that are performed frequently
(e.g., walking, sitting) lie closer to the self point in a Galileo
space than do behaviors that are performed infrequently or rarely
(e.g., marrying, fighting). In fact, between repeated administrations
of the behavior questionnaire, National Guard forces were sent to
the University of Illinois campus in response to the US
invasion of Cambodia and the killing of four students
at Kent State University.
At the next administration, fighting moved considerably closer
to the self point in the map, while revolution moved even
farther away from the self.
Galileo lends itself well to time series measurement and experimental
research, because it makes it possible to project measurements made
at different times onto the same coordinates, and because the algorithm
behaves identically every time. The non-metric scaling algorithms
are seldom seen in time series or experimental research, because
their iterative approximation interacts non-linearly with data and
thus treats data differently in each session, and the merely ordinal
character of the data is not strong enough to show changes over
time with meaningful precision.
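One standard way to project two configurations onto common coordinates is an orthogonal Procrustes rotation; the Galileo literature describes its own rotation procedures, so the sketch below (with invented coordinates) is only a stand-in for the idea: rotate the second time point onto the first so that only real motion, not an arbitrary reorientation of the reference frame, remains.

```python
import numpy as np

# Hypothetical Galileo coordinates for four concepts at time 1.
t1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])

# Time 2: the same configuration rigidly rotated (an artifact of the
# algorithm's arbitrary orientation), plus one genuine change --
# the third concept has actually moved.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t2 = t1 @ R.T
t2[2] += [0.5, -0.5]

# Orthogonal Procrustes: find the rotation bringing t2 into
# least-squares congruence with t1.
U, _, Vt = np.linalg.svd(t2.T @ t1)
rotation = U @ Vt
t2_aligned = t2 @ rotation

# Per-concept displacement: the artifactual rotation is removed,
# so only the real motion of the third concept stands out.
motion = np.linalg.norm(t2_aligned - t1, axis=1)
```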
Ideal Point